Developing the Benchmark: Establishing a Gold Standard for the Evaluation of AI Caries Diagnostics

Published: 2024-06-29
Volume: 13
Issue: 13
Page: 3846
ISSN: 2077-0383
Journal: Journal of Clinical Medicine (JCM)
Language: en

Author:

Boldt Julian¹, Schuster Matthias¹, Krastl Gabriel², Schmitter Marc¹, Pfundt Jonas¹, Stellzig-Eisenhauer Angelika³, Kunz Felix³

Affiliation:

1. Department of Prosthetic Dentistry, University Hospital Würzburg, 97070 Würzburg, Germany

2. Center of Dental Traumatology, Department of Conservative Dentistry and Periodontology, University Hospital Würzburg, 97070 Würzburg, Germany

3. Department of Orthodontics, University Hospital Würzburg, 97070 Würzburg, Germany

Abstract

Background/Objectives: The aim of this study was to establish a histology-based gold standard for evaluating artificial intelligence (AI)-based caries detection systems on proximal surfaces in bitewing images.

Methods: Extracted human teeth were used to simulate intraoral situations, including caries-free teeth, teeth with artificially created defects and teeth with natural proximal caries. All 153 simulations were radiographed from seven angles, resulting in 1071 in vitro bitewing images. The depth of the carious lesions was examined histologically twice by an expert. Thirty examiners analyzed all the radiographs for caries.

Results: We generated in vitro bitewing images to evaluate the performance of AI-based carious lesion detection against a histological gold standard. Taken together, the examiners achieved a sensitivity of 0.565, a Matthews correlation coefficient (MCC) of 0.578 and an area under the curve (AUC) of 76.1%. The histology receiver operating characteristic (ROC) curve significantly outperformed the examiners’ ROC curve (p < 0.001). The examiners distinguished induced defects from true caries in 54.6% of cases and correctly classified 99.8% of all teeth. Expert caries classification of the histological images showed a high level of agreement (intraclass correlation coefficient (ICC) = 0.993). Examiner performance varied with caries depth (p ≤ 0.008), except between E2 and E1 lesions (p = 1), while central beam eccentricity, gender, occupation and experience had no significant influence (all p ≥ 0.411).

Conclusions: This study successfully established an unbiased dataset to evaluate AI-based caries detection on proximal surfaces in bitewing images and to compare it with human judgement. The dataset provides a standardized assessment for fair comparison between AI technologies and helps dental professionals to select reliable diagnostic tools.
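As an illustration of two of the metrics reported in the abstract, the following minimal sketch computes sensitivity and the Matthews correlation coefficient from the four cells of a binary confusion matrix (examiner call versus histological reference). The function names and the counts in the usage lines are hypothetical and chosen only for illustration; they are not taken from the study, and the study's own analysis pipeline is not described here.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """Share of histologically carious surfaces that were called carious."""
    total_positive = tp + fn
    return tp / total_positive if total_positive else 0.0


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a 2x2 confusion matrix.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect agreement). Returns 0.0 when any marginal sum is zero,
    which is a common convention for the undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts for one examiner, NOT data from the study.
    tp, fn, fp, tn = 50, 40, 5, 120
    print(f"sensitivity = {sensitivity(tp, fn):.3f}")
    print(f"MCC         = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
```

Unlike sensitivity, the MCC uses all four cells, so it stays informative when sound surfaces greatly outnumber carious ones.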

Publisher

MDPI AG

Link

https://www.mdpi.com/2077-0383/13/13/3846/pdf
