Affiliation:
1. College of Computing and Digital Technology, Birmingham City University, Birmingham B4 7XG, UK
2. Computer Science Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
Abstract
Stroke poses a significant health threat, affecting millions of people annually. Early and precise prediction is crucial for effective preventive healthcare interventions. This study applied an ensemble machine learning and data mining approach to improve stroke prediction. Following the cross-industry standard process for data mining (CRISP-DM) methodology, several techniques, including random forest, ExtraTrees, XGBoost, artificial neural network (ANN), and genetic algorithm with ANN (GANN), were applied to two benchmark datasets to predict stroke from parameters such as gender, age, smoking status, BMI, high cholesterol (HighCol), physical activity, hypertension, heart disease, and other disease and lifestyle factors. Because the datasets were imbalanced, the synthetic minority oversampling technique (SMOTE) was applied to them. Model hyperparameters were tuned using grid search and randomized search with cross-validation. The evaluation metrics were accuracy, precision, recall, F1-score, and area under the curve (AUC). The experimental results show that the ensemble ExtraTrees classifier achieved the highest accuracy (98.24%) and AUC (98.24%). Random forest also performed well, achieving 98.03% in both accuracy and AUC. Comparisons with state-of-the-art stroke prediction methods indicate that the proposed approach performs better, suggesting that it is a promising method for stroke prediction with substantial potential benefits for healthcare.
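To illustrate the oversampling step mentioned in the abstract, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is an interpolation between a minority sample and one of its k nearest minority neighbours. It is not the authors' implementation. The function name `smote`, its parameters, and the toy `[age, bmi]` rows are hypothetical, and the sketch assumes purely numerical features (categorical fields such as gender or smoking status would need encoding or a categorical-aware variant).

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority-class neighbours.

    Hypothetical helper for illustration; assumes numerical feature rows.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Pick one of the k nearest neighbours at random
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies on the segment between base and neighbour
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: [age, bmi] rows for the minority (stroke) class
stroke_rows = [[67.0, 36.6], [80.0, 32.5], [49.0, 34.4], [79.0, 24.0]]
new_rows = smote(stroke_rows, n_synthetic=6, k=2)
```

In common practice, oversampling of this kind is applied only to the training split, so that the reported accuracy and AUC are measured on data that contains no synthetic samples.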
Funder
Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University