Reader bias in breast cancer screening related to cancer prevalence and artificial intelligence decision support—a reader study

Published: 2024-01-02
ISSN: 1432-1084
Container-title: European Radiology
language: en
Short-container-title: Eur Radiol

Author:

Al-Bazzaz Hanen, Janicijevic Marina, Strand Fredrik

Abstract

Objectives: The aim of our study was to examine how breast radiologists would be affected by high cancer prevalence and the use of artificial intelligence (AI) for decision support.

Materials and methods: This reader study was based on a selection of screening mammograms, including the original radiologist assessment, acquired from 2010 to 2013 at the Karolinska University Hospital, with a 1:1 ratio of cancer to healthy cases based on a 2-year follow-up. A commercial AI system generated an exam-level positive or negative read, and image markers. Double-reading and consensus discussions were first performed without AI and later with AI, with a 6-week wash-out period in between. The chi-squared test was used to test for differences in contingency tables.

Results: Mammograms of 758 women were included, half with cancer and half healthy. Of these women, 52% were aged 40–55 years and 48% were aged 56–75 years. In the original non-enriched screening setting, the sensitivity was 61% (232/379) at a specificity of 98% (373/379). In the reader study, the sensitivity without and with AI was 81% (307/379) and 75% (284/379), respectively (p < 0.001). The specificity without and with AI was 67% (255/379) and 86% (326/379), respectively (p < 0.001). The tendency to change an assessment from positive to negative based on erroneous AI information differed between readers and was affected by the type and number of image signs of malignancy.

Conclusion: Breast radiologists reading a list with high cancer prevalence performed at considerably higher sensitivity and lower specificity than the original screen-readers. Adding AI information, calibrated to a screening setting, decreased sensitivity and increased specificity.

Clinical relevance statement: Radiologist screening mammography assessments will be biased towards higher sensitivity and lower specificity by high-risk triaging and nudged towards the sensitivity and specificity setting of AI reads. After AI implementation in clinical practice, there is reason to carefully follow screening metrics to ensure the impact is the desired one.

Key Points
• Breast radiologists’ sensitivity and specificity will be affected by changes brought by artificial intelligence.
• Reading in a high cancer prevalence setting markedly increased sensitivity and decreased specificity.
• Reviewing the binary reads by AI, negative or positive, biased screening radiologists towards the sensitivity and specificity of the AI system.
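
The reader-study percentages in the abstract follow directly from the reported counts, and the comparison it describes can be illustrated with a chi-squared test on a 2×2 contingency table. The sketch below is a minimal illustration and not the authors' analysis code. The function names and the table layout (reading condition by correct/incorrect assessment) are assumptions made here, because the abstract does not state how the contingency tables were built. The resulting statistics therefore need not reproduce the reported p-values.

```python
import math

def rate(correct, total):
    """Proportion of correctly assessed exams (sensitivity or specificity)."""
    return correct / total

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-squared with df = 1: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

N = 379  # exams per group (cancer, healthy), as reported in the abstract

# Counts reported for the reader study
sens_without_ai = rate(307, N)
sens_with_ai = rate(284, N)
spec_without_ai = rate(255, N)
spec_with_ai = rate(326, N)

# Assumed layout: rows = without AI / with AI, columns = correct / incorrect
sens_stat, sens_p = chi2_2x2(307, N - 307, 284, N - 284)
spec_stat, spec_p = chi2_2x2(255, N - 255, 326, N - 326)

if __name__ == "__main__":
    print(f"sensitivity: {sens_without_ai:.0%} -> {sens_with_ai:.0%}")
    print(f"specificity: {spec_without_ai:.0%} -> {spec_with_ai:.0%}")
    print(f"chi2 (sensitivity table): {sens_stat:.2f}, p = {sens_p:.3g}")
    print(f"chi2 (specificity table): {spec_stat:.2f}, p = {spec_p:.3g}")
```

This table layout treats the two reading conditions as independent samples. The same exams were read in both conditions, so a paired analysis such as McNemar's test may suit the design better, but it would need per-exam results that the abstract does not provide.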

Funder

Karolinska Institute

Publisher

Springer Science and Business Media LLC

Subject

Radiology, Nuclear Medicine and Imaging, General Medicine

Link

https://link.springer.com/content/pdf/10.1007/s00330-023-10514-5.pdf

Cited by 1 article.

1. The unintended consequences of artificial intelligence and high-risk triaging; European Radiology; 2024-01-04