Diagnosis of Endometriosis Based on Comorbidities: A Machine Learning Approach

Author:

Tore Ulan (1); Abilgazym Aibek (1); Asunsolo-del-Barco Angel (2, 3, 4); Terzic Milan (5, 6, 7); Yemenkhan Yerden (8); Zollanvari Amin (1); Sarria-Santamera Antonio (9)

Affiliation:

1. School of Engineering and Digital Sciences, Nazarbayev University, Astana 010000, Kazakhstan

2. Department of Surgery, Medical and Social Sciences, Faculty of Medicine, University of Alcalá, 28871 Alcalá de Henares, Spain

3. Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York (CUNY), New York, NY 10028, USA

4. Ramón y Cajal Institute of Healthcare Research (IRYCIS), 28034 Madrid, Spain

5. Department of Surgery, School of Medicine, Nazarbayev University, Astana 010000, Kazakhstan

6. Clinical Academic Department of Women’s Health, CF “University Medical Center”, Astana 010000, Kazakhstan

7. Department of Obstetrics, Gynecology and Reproductive Sciences, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA

8. Department of Medicine, School of Medicine, Nazarbayev University, Astana 010000, Kazakhstan

9. Department of Biomedical Sciences, School of Medicine, Nazarbayev University, Astana 010000, Kazakhstan

Abstract

Endometriosis is defined as the presence of estrogen-dependent endometrial-like tissue outside the uterine cavity. Despite extensive research, it remains an enigmatic disease that is challenging to diagnose and treat. A common clinical finding is the association of endometriosis with multiple other diseases. We use a total of 627,566 clinically collected records, comprising endometriosis cases (0.82%) and controls (99.18%), to construct and evaluate predictive models. We develop a machine learning platform to construct diagnostic tools for endometriosis. The platform uses logistic regression, decision tree, random forest, AdaBoost, and XGBoost for prediction, and Shapley Additive Explanation (SHAP) values to quantify feature importance. In the model selection phase, XGBoost outperforms the other algorithms; in the evaluation phase, it achieves an area under the curve (AUC) of 0.725 on the test set, with a specificity of 62.9% and a sensitivity of 68.6%. The model yields a low positive predictive value of 1.5% but a high negative predictive value of 99.58%. The feature importance analysis identifies age, infertility, uterine fibroids, anxiety, and allergic rhinitis as the five most important features for predicting endometriosis. Although these results show the feasibility of using machine learning to improve the diagnosis of endometriosis, more research is required to improve the performance of the predictive models. The limited performance is attributed in part to the complex nature of the condition and in part to the administrative nature of our features. With more informative features, a higher AUC might be achievable. We therefore regard the constructed predictive model only as a tool that provides auxiliary information in clinical practice.
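The gap between the two predictive values follows largely from the low prevalence of cases. As a minimal sketch, not the authors' code, the following applies Bayes' rule to the sensitivity, specificity, and case fraction reported in the abstract. The function name `predictive_values` is hypothetical, and the sketch assumes that the test-set prevalence equals the overall case fraction of 0.82%, which the abstract does not state.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) via Bayes' rule; all arguments are fractions in [0, 1]."""
    true_pos = sensitivity * prevalence                # cases flagged as positive
    false_pos = (1 - specificity) * (1 - prevalence)   # controls flagged as positive
    true_neg = specificity * (1 - prevalence)          # controls flagged as negative
    false_neg = (1 - sensitivity) * prevalence         # cases flagged as negative
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Values taken from the abstract; prevalence on the test set is an assumption.
ppv, npv = predictive_values(sensitivity=0.686, specificity=0.629, prevalence=0.0082)
print(f"PPV = {ppv:.2%}, NPV = {npv:.2%}")
```

Under these assumptions the sketch gives a PPV of about 1.5% and an NPV of about 99.6%, in line with the reported 1.5% and 99.58%. With under 1% of records being cases, even a moderately specific model produces far more false positives than true positives.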

Funder

Nazarbayev University

Publisher

MDPI AG

Subject

General Biochemistry, Genetics and Molecular Biology; Medicine (miscellaneous)
