Developing an ensemble machine learning study: Insights from a multi-center proof-of-concept study

Published:2024-09-10 Issue:9 Volume:19 Page:e0303217
ISSN:1932-6203
Container-title:PLOS ONE
language:en
Short-container-title:PLoS ONE

Author:

Fanizzi Annarita, Fadda Federico, Maddalo Michele, Saponaro Sara, Lorenzon Leda, Ubaldi Leonardo, Lambri Nicola, Giuliano Alessia, Loi Emiliano, Signoriello Michele, Branchini Marco, Belmonte Gina, Giannelli Marco, Mancosu Pietro, Talamonti Cinzia, Iori Mauro, Tangaro Sabina, Avanzo Michele, Massafra Raffaella

Abstract

Background To address the numerous unmet clinical needs, several machine learning models applied to medical images and clinical data have been introduced and developed in recent years. Even when they achieve encouraging results, they lack evolutionary progression and so remain isolated, stand-alone tools. We postulated that different algorithms proposed in the literature to address the same diagnostic task can be aggregated to enhance classification performance. We propose a proof of concept that defines an ensemble approach for integrating different algorithms designed to solve the same clinical task.
Methods The proposed approach was developed starting from a public database of radiomic features extracted from CT images of 535 patients with lung cancer. Seven algorithms were trained independently by participants in the AI4MP working group on Artificial Intelligence of the Italian Association of Physics in Medicine to discriminate metastatic from non-metastatic patients. The classification scores generated by these algorithms were used to train an SVM classifier. An Explainable Artificial Intelligence approach was applied to the final model. The ensemble model was validated following an 80–20 hold-out and leave-one-out scheme on the training set.
Results The ensemble model achieved a more accurate result than the individual algorithms. On the independent test set, the ensemble model achieved an accuracy of 0.78, an F1-score of 0.57, and a log-loss of 0.49. Shapley values representing the contribution of each algorithm to the final classification result of the ensemble model were calculated. This information is an added value for the end user, who can use it to evaluate the appropriateness of the classification result for a particular case. It also shows, at a global level, which methodological approaches of the individual algorithms are likely to have the most impact.
Conclusion Our proposal is an innovative approach for integrating the different algorithms that populate the literature, and it lays the foundations for future evaluations in broader application scenarios.
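
The Shapley attribution described in the Results can be illustrated with a minimal sketch. It assumes a hypothetical linear meta-model (`meta_score`, with made-up `WEIGHTS` and `BIAS`) standing in for the trained SVM, seven illustrative base-algorithm scores for one case (`case`), and a neutral reference point (`baseline`); none of these names or values come from the study. The function enumerates all coalitions of base algorithms, which is feasible for seven inputs, and is not necessarily how the authors computed their values.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    n = len(x)

    def value(coalition):
        # Inputs in the coalition take the case's value; the rest stay at baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude input i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


# Hypothetical meta-model: a weighted sum of the seven base scores,
# used here only as a stand-in for an SVM decision function.
WEIGHTS = [0.9, 0.4, 1.2, 0.1, 0.7, 0.3, 0.6]
BIAS = -2.0


def meta_score(scores):
    return BIAS + sum(w * s for w, s in zip(WEIGHTS, scores))


# Illustrative scores from the seven base algorithms for one patient (made up).
case = [0.8, 0.2, 0.9, 0.5, 0.6, 0.1, 0.7]
# Assumed reference point: every base algorithm outputs an uninformative 0.5.
baseline = [0.5] * 7

phi = shapley_values(meta_score, case, baseline)
for index, contribution in enumerate(phi, start=1):
    print(f"algorithm {index}: {contribution:+.3f}")
```

The contributions sum to the difference between the meta-model's output for the case and its output at the baseline, so a reader can see which base algorithms pushed a given classification up or down. Exact enumeration grows exponentially with the number of inputs; larger ensembles would need a sampling-based approximation.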

Funder

Ministero della Salute

Publisher

Public Library of Science (PLoS)

References: 55 articles.

1. Artificial intelligence in oncology;H. Shimizu;Cancer Science,2020

2. A machine learning ensemble approach for 5- and 10-year breast cancer invasive disease event classification;R. Massafra;PLOS ONE,2022

3. A clinical decision support system for predicting invasive breast cancer recurrence: preliminary results;R. Massafra;Frontiers in Oncology,2021

4. Network-based machine learning and graph theory algorithms for precision oncology;W. Zhang;npj Precision Oncology,2017

5. A semi-hard voting combiner scheme to ensemble multi-class probabilistic classifiers;R. Delgado;Applied Intelligence,2022