StrokeClassifier: Ischemic Stroke Etiology Classification by Ensemble Consensus Modeling Using Electronic Health Records

Author:

Lee Ho-Joon (1), Schwamm Lee H. (2), Sansing Lauren (3), Kamel Hooman (4), de Havenon Adam (3), Turner Ashby C. (5), Sheth Kevin N. (3), Krishnaswamy Smita (6), Brandt Cynthia (7), Zhao Hongyu (8), Krumholz Harlan (9), Sharma Richa (3)

Affiliation:

1. Department of Genetics and Yale Center for Genome Analysis, Yale School of Medicine, New Haven, CT

2. Department of Neurology and Comprehensive Stroke Center, Massachusetts General Hospital and Harvard Medical School, Boston, MA; Department of Neurology, Yale School of Medicine, New Haven, CT

3. Department of Neurology, Yale School of Medicine, New Haven, CT

4. Department of Neurology, Weill Cornell Medicine, New York City, NY

5. Department of Neurology and Comprehensive Stroke Center, Massachusetts General Hospital and Harvard Medical School, Boston, MA

6. Departments of Genetics and Computer Science, Yale School of Medicine, New Haven, CT

7. Department of Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, CT

8. Department of Biostatistics, Yale School of Public Health, New Haven, CT

9. Department of Internal Medicine, Yale School of Medicine, New Haven, CT

Abstract

Determining the etiology of an acute ischemic stroke (AIS) is fundamental to secondary stroke prevention efforts but can be diagnostically challenging. We trained and validated an automated classification machine intelligence tool, StrokeClassifier, using electronic health record (EHR) text data from 2,039 non-cryptogenic AIS patients at 2 academic hospitals to predict the 4-level outcome of stroke etiology, as determined by agreement of at least 2 board-certified vascular neurologists who reviewed the stroke hospitalization EHR. StrokeClassifier is an ensemble consensus meta-model of 9 machine learning classifiers applied to features extracted from discharge summary texts by natural language processing. StrokeClassifier was externally validated in 406 discharge summaries from the MIMIC-III dataset, which were reviewed by a vascular neurologist to ascertain stroke etiology. Compared with stroke etiologies adjudicated by vascular neurologists, the 9 base classifiers performed well, with a mean cross-validated area under the receiver operating characteristic curve (AUCROC) of 0.90. Their ensemble meta-model, StrokeClassifier, achieved a mean cross-validated accuracy of 0.74 and a weighted F1 of 0.74. In the MIMIC-III cohort, the accuracy and weighted F1 of StrokeClassifier were 0.70 and 0.71, respectively. SHapley Additive exPlanation analysis revealed that the top 5 features contributing to stroke etiology prediction were atrial fibrillation, age, middle cerebral artery occlusion, internal carotid artery occlusion, and frontal stroke location. We then designed a certainty heuristic that deems a StrokeClassifier diagnosis confidently non-cryptogenic according to the degree of consensus among the 9 classifiers, and applied it to 788 cryptogenic patients. This reduced the percentage of cryptogenic strokes from 25.2% to 7.2% of all ischemic strokes. StrokeClassifier is a validated artificial intelligence tool that rivals the performance of vascular neurologists in classifying ischemic stroke etiology for individual patients. With further training, StrokeClassifier may have downstream applications, including use as a clinical decision support system.
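
The consensus-based certainty heuristic described above can be illustrated with a minimal sketch. The abstract does not state the agreement threshold or the names of the 4 etiology classes, so the threshold `min_agreement=7`, the function name `consensus_label`, and the labels `class_a`, `class_b`, and `class_c` below are assumptions for illustration only and are not the authors' implementation.

```python
from collections import Counter

def consensus_label(votes, min_agreement=7):
    """Return (label, n_agree) from a list of base-classifier predictions.

    If the most common predicted etiology is backed by at least
    `min_agreement` classifiers, it is returned as a confident,
    non-cryptogenic call; otherwise the case stays "cryptogenic".
    The threshold is a hypothetical value, not taken from the paper.
    """
    label, n_agree = Counter(votes).most_common(1)[0]
    if n_agree >= min_agreement:
        return label, n_agree
    return "cryptogenic", n_agree

# Hypothetical predictions from 9 base classifiers for two patients.
strong = ["class_a"] * 8 + ["class_b"]
weak = ["class_a"] * 4 + ["class_b"] * 3 + ["class_c"] * 2

print(consensus_label(strong))
print(consensus_label(weak))
```

In this sketch, a patient is reclassified only when a clear majority of the 9 classifiers agree on one etiology; raising or lowering the assumed threshold trades the number of reclassified cases against confidence in each call.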

Publisher

Research Square Platform LLC
