Evaluation of Four Artificial Intelligence–Assisted Self-Diagnosis Apps on Three Diagnoses: Two-Year Follow-Up Study (Preprint)

Published: 2020-02-06

Author:

Ćirković Aleksandar

Abstract

BACKGROUND

Consumer-oriented mobile self-diagnosis apps have been developed using undisclosed algorithms, presumably based on machine learning and other artificial intelligence (AI) technologies. The US Food and Drug Administration now distinguishes apps with learning AI algorithms from those with stable ones and treats the former as medical devices. To the author’s knowledge, no self-diagnosis app testing has been performed in the field of ophthalmology so far.

OBJECTIVE

The objective of this study was to test apps previously mentioned in the scientific literature on a set of diagnoses at two time points separated by a deliberately chosen interval, comparing the results and looking for differences that hint at “nonlocked” learning algorithms.

METHODS

Four apps from the literature were chosen (Ada, Babylon, Buoy, and Your.MD). A set of three ophthalmology diagnoses (glaucoma, retinal tear, dry eye syndrome) representing three levels of urgency was used to simultaneously test the apps’ diagnostic efficiency and treatment recommendations in this specialty. Two years was the chosen time interval between the tests (2018 and 2020). Scores were awarded by one evaluating physician using a defined scheme.

RESULTS

Two apps (Ada and Your.MD) received significantly higher scores than the other two. All apps either worsened in their results between 2018 and 2020 or remained unchanged at a low level. The variation in the results over time indicates “nonlocked” learning algorithms using AI technologies. None of the apps provided correct diagnoses and treatment recommendations for all three diagnoses in 2020. Two apps (Babylon and Your.MD) asked significantly fewer questions than the other two (P<.001).

CONCLUSIONS

“Nonlocked” algorithms are used by self-diagnosis apps. The diagnostic efficiency of the tested apps seems to worsen over time, with some apps being more capable than others. Systematic studies on a wider scale are necessary for health care providers and patients to correctly assess the safety and efficacy of such apps and for correct classification by health care regulating authorities.

Publisher

JMIR Publications Inc.
