Affiliation:
1. Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland
2. City Hospital Waid, Zurich, Switzerland
3. L2F (Learn to Forecast), Lausanne, Switzerland
4. University Hospital of Basel, Basel, Switzerland
Abstract
Objective: Automated machine learning (autoML) platforms allow health care professionals to play an active role in the development of machine learning (ML) algorithms according to scientific or clinical needs. The aim of this study was to develop and evaluate such a model for automated detection and grading of distal hand osteoarthritis (OA).
Methods: A total of 13,690 hand radiographs from 2,863 patients within the Swiss Cohort of Quality Management (SCQM) and an external control data set of 346 non‐SCQM patients were collected and scored for distal interphalangeal OA (DIP‐OA) using the modified Kellgren/Lawrence (K/L) score. Giotto (Learn to Forecast [L2F]) was used as an autoML platform for training two convolutional neural networks for DIP joint extraction and subsequent classification according to the K/L scores. A total of 48,892 DIP joints were extracted and then used to train the classification model. Heatmaps were generated independently of the platform. User experience of a web application as a provisional user interface was investigated by rheumatologists and radiologists.
Results: The sensitivity and specificity of this model for detecting DIP‐OA were 79% and 86%, respectively. The accuracy for grading the correct K/L score was 75%, with a κ score of 0.76. The accuracy per DIP‐OA class differed, with 86% for no OA (defined as K/L scores 0 and 1), 71% for a K/L score of 2, 46% for a K/L score of 3, and 67% for a K/L score of 4. Similar values were obtained in an independent external test set. Qualitative and quantitative user experience testing of the web application revealed a moderate to high demand for automated DIP‐OA scoring among rheumatologists. Conversely, radiologists expressed a low demand, except for the use of heatmaps.
Conclusion: AutoML platforms are an opportunity to develop clinical end‐to‐end ML algorithms. Here, automated radiographic DIP‐OA detection is both feasible and usable, whereas grading among individual K/L scores (eg, for clinical trials) remains challenging.
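As an illustration of how the metrics reported in the Results relate to one another, the following minimal Python sketch computes sensitivity, specificity, grading accuracy, and a κ score from paired reference and predicted K/L grades. It is not the study's evaluation code. The function names and the toy labels are hypothetical, the sketch assumes that OA is counted as present from a K/L score of 2 upward (consistent with "no OA" being defined as K/L scores 0 and 1), and it uses unweighted Cohen's κ, whereas the abstract does not state which κ variant was used.

```python
from collections import Counter


def to_oa_class(kl_grade):
    """Collapse K/L grades 0 and 1 into a single 'no OA' class (0)."""
    return 0 if kl_grade <= 1 else kl_grade


def sensitivity_specificity(y_true, y_pred):
    """Binary detection metrics, assuming OA is present when the class is >= 2."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        t_pos, p_pos = t >= 2, p >= 2
        if t_pos and p_pos:
            tp += 1
        elif t_pos:
            fn += 1
        elif p_pos:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def accuracy(y_true, y_pred):
    """Fraction of joints whose predicted class equals the reference class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical toy K/L grades for ten joints (not data from the study).
reference = [to_oa_class(g) for g in [0, 0, 1, 2, 2, 3, 3, 4, 4, 1]]
predicted = [to_oa_class(g) for g in [0, 1, 1, 2, 3, 3, 2, 4, 4, 2]]

sens, spec = sensitivity_specificity(reference, predicted)
acc = accuracy(reference, predicted)
kappa = cohen_kappa(reference, predicted)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f} kappa={kappa:.2f}")
```

Because K/L grades are ordinal, a weighted κ that penalizes a one-grade disagreement less than a larger one is a common alternative to the unweighted form shown here.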