Weakly Supervised MRI Slice‐Level Deep Learning Classification of Prostate Cancer Approximates Full Voxel‐ and Slice‐Level Annotation: Effect of Increasing Training Set Size

Author:

Weißer Cedric (1, 2), Netzer Nils (1, 2), Görtz Magdalena (3, 4), Schütz Viktoria (3), Hielscher Thomas (5), Schwab Constantin (6), Hohenfellner Markus (3), Schlemmer Heinz‐Peter (1, 7, 8), Maier‐Hein Klaus H. (7, 9, 10), Bonekamp David (1, 2, 7, 8)

Affiliation:

1. Division of Radiology, German Cancer Research Center (DKFZ), Heidelberg, Germany

2. Heidelberg University Medical School, Heidelberg, Germany

3. Department of Urology, University of Heidelberg Medical Center, Heidelberg, Germany

4. Junior Clinical Cooperation Unit, Multiparametric Methods for Early Detection of Prostate Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany

5. Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany

6. Institute of Pathology, University of Heidelberg Medical Center, Heidelberg, Germany

7. National Center for Tumor Diseases (NCT) Heidelberg, Heidelberg, Germany

8. German Cancer Consortium (DKTK), Germany

9. Medical Image Computing, German Cancer Research Center (DKFZ), Heidelberg, Germany

10. Pattern Analysis and Learning Group, Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, Germany

Abstract

Background: Weakly supervised learning promises reduced annotation effort while maintaining performance.

Purpose: To compare weakly supervised training with full slice‐wise annotated training of a deep convolutional classification network (CNN) for prostate cancer (PC).

Study Type: Retrospective.

Subjects: One thousand four hundred eighty‐nine consecutive institutional prostate MRI examinations from men with suspicion for PC (65 ± 8 years) between January 2015 and November 2020 were split into a training set (N = 794, enriched with 204 PROSTATEx examinations) and a test set (N = 695).

Field Strength/Sequence: 1.5 and 3 T, T2‐weighted turbo‐spin‐echo and diffusion‐weighted echo‐planar imaging.

Assessment: Histopathological ground truth was provided by targeted and extended systematic biopsy. Reference training was performed using slice‐level annotation (SLA) and compared to iterative training utilizing patient‐level annotations (PLAs) with supervised feedback of CNN estimates into the next training iteration at three incremental training set sizes (N = 200, 500, 998). Model performance was assessed by comparing specificity at fixed sensitivity of 0.97 [254/262], emulating PI‐RADS ≥ 3 decisions, and 0.88–0.90 [231–236/262], emulating PI‐RADS ≥ 4 decisions.

Statistical Tests: Receiver operating characteristic (ROC) and area under the curve (AUC) were compared using DeLong and Obuchowski tests. Sensitivity and specificity were compared using the McNemar test. The statistical significance threshold was P = 0.05.

Results: Test set (N = 695) ROC‐AUC performance of SLA (trained with 200/500/998 exams) was 0.75/0.80/0.83, respectively. PLA achieved lower ROC‐AUC of 0.64/0.72/0.78. Performance of both increased significantly with increasing training set size. ROC‐AUC for SLA at 500 exams was comparable to PLA at 998 exams (P = 0.28). ROC‐AUC was significantly different between SLA and PLA at the same training set sizes; however, the ROC‐AUC difference decreased significantly from 200 to 998 training exams. Emulating PI‐RADS ≥ 3 decisions, the difference between PLA specificity of 0.12 [51/433] and SLA specificity of 0.13 [55/433] became undetectable (P = 1.0) at 998 exams. Emulating PI‐RADS ≥ 4 decisions, at 998 exams, SLA specificity of 0.51 [221/433] remained higher than PLA specificity of 0.39 [170/433]. However, PLA specificity at 998 exams became comparable to SLA specificity of 0.37 [159/433] at 200 exams (P = 0.70).

Data Conclusion: Weakly supervised training of a classification CNN using patient‐level‐only annotation had lower performance compared to training with slice‐wise annotations, but improved significantly faster with additional training data.

Evidence Level: 3

Technical Efficacy: Stage 2
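The evaluation metric used above, specificity at a fixed sensitivity, can be illustrated with a minimal sketch. This is not the authors' code: the function name `specificity_at_sensitivity`, the convention that a score at or above the threshold counts as positive, and the toy scores are all assumptions made for illustration only.

```python
import math


def specificity_at_sensitivity(scores, labels, target_sensitivity):
    """Hypothetical helper: pick the highest score threshold whose sensitivity
    is at least `target_sensitivity`, then report the specificity there.

    A case is called positive when its score is >= the threshold.
    Returns (threshold, sensitivity, specificity).
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    # Smallest number of positive cases that must be detected.
    needed = max(1, math.ceil(target_sensitivity * len(positives)))
    # The score of the `needed`-th highest-scoring positive is the threshold.
    threshold = positives[needed - 1]

    true_positives = sum(1 for s in positives if s >= threshold)
    true_negatives = sum(1 for s in negatives if s < threshold)
    return (
        threshold,
        true_positives / len(positives),
        true_negatives / len(negatives),
    )


# Toy data (invented for illustration): four positive and four negative cases.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
print(specificity_at_sensitivity(toy_scores, toy_labels, 0.75))
print(specificity_at_sensitivity(toy_scores, toy_labels, 1.0))

# The fractions reported in the abstract round to the stated values,
# and the positive and negative counts add up to the test set size.
assert round(254 / 262, 2) == 0.97
assert round(55 / 433, 2) == 0.13
assert 262 + 433 == 695
```

Raising the required sensitivity lowers the threshold, which typically costs specificity. This is the trade‐off behind the lower specificities reported when emulating PI‐RADS ≥ 3 decisions compared with PI‐RADS ≥ 4 decisions.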

Publisher

Wiley

Subject

Radiology, Nuclear Medicine and Imaging
