Optimizing Skin Cancer Survival Prediction with Ensemble Techniques
Published: 2023-12-31
Volume: 11
Issue: 1
Page: 43
ISSN: 2306-5354
Container-title: Bioengineering
Language: en
Authors:
Abbasi Erum Yousef (1), Deng Zhongliang (1), Magsi Arif Hussain (2), Ali Qasim (3), Kumar Kamlesh (4), Zubedi Asma (5)
Affiliation:
1. State Key Laboratory of Wireless Network Positioning and Communication Engineering Integration Research, School of Electronics Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
2. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
3. Department of Software Engineering, Mehran University of Engineering and Technology, Jamshoro 76062, Pakistan
4. School of Electronics Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
5. School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing 100876, China
Abstract
Cancer research that combines high-throughput technology with artificial intelligence (AI) is gaining momentum as a way to improve disease diagnosis and targeted therapy. However, complex, imbalanced, high-dimensional data pose significant challenges for computational approaches and multi-omics data analysis. This study focuses on predicting skin cancer and analyzing overall survival probability. We employ the Kaplan–Meier estimator and the Cox proportional hazards regression model alongside machine learning (ML)-based ensemble methods. The proposed techniques are applied to a publicly available dataset from the ICGC Data Portal, specifically skin cutaneous melanoma (SKCM). We used eight baseline classifiers: random forest (RF), decision tree (DT), gradient boosting (GB), AdaBoost, Gaussian naïve Bayes (GNB), extra tree (ET), logistic regression (LR), and light gradient boosting machine (LightGBM or LGBM). We also built and trained four ensemble methods (stacking, bagging, boosting, and voting). Performance was evaluated and interpreted using accuracy, precision, recall, F1 score, confusion matrices, and ROC curves. The voting ensemble achieved an accuracy of 99%. Among the baseline classifiers, RF performed best, also reaching 99% accuracy with strong precision. Compared with existing state-of-the-art techniques, the proposed approach showed improved accuracy and efficiency. This work thus contributes to diagnosing SKCM with high accuracy.
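For readers unfamiliar with the Kaplan–Meier estimator named in the abstract, the following is a minimal sketch of the product-limit calculation, S(t) = Π (1 − d_i / n_i) over event times t_i ≤ t, where d_i is the number of deaths at t_i and n_i the number of patients still at risk. It is not the authors' code: the function name `kaplan_meier` and the toy follow-up data are hypothetical and are not drawn from the ICGC SKCM dataset.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: follow-up time for each patient.
    events:    1 if death was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each observed death time.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    deaths = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # deaths and censorings both leave the risk set

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy cohort (months of follow-up), for illustration only.
toy_durations = [5, 8, 8, 12, 15, 20]
toy_events = [1, 1, 0, 1, 0, 1]

for time, prob in kaplan_meier(toy_durations, toy_events):
    print(f"t={time:>2}  S(t)={prob:.3f}")
```

Censored patients (event flag 0) never reduce the survival estimate directly; they only shrink the risk set for later event times, which is what distinguishes this estimator from a naive fraction of survivors.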