Outcome Prediction Using Multi-Modal Information: Integrating Large Language Model-Extracted Clinical Information and Image Analysis
Author:
Sun Di (1), Hadjiiski Lubomir (1), Gormley John (1), Chan Heang-Ping (1), Caoili Elaine (1), Cohan Richard (1), Alva Ajjai (2), Bruno Grace (1), Mihalcea Rada (3), Zhou Chuan (1), Gulani Vikas (1)
Affiliation:
1. Department of Radiology, University of Michigan, Ann Arbor, MI 48109, USA
2. Department of Internal Medicine-Hematology/Oncology, University of Michigan, Ann Arbor, MI 48109, USA
3. Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
Abstract
Survival prediction after cystectomy is essential for the follow-up care of bladder cancer patients. This study aimed to evaluate artificial intelligence (AI) large language models (LLMs) for extracting clinical information and improving image analysis, with an initial application to predicting five-year survival of patients after radical cystectomy for bladder cancer. Data were retrospectively collected from the medical records and CT urograms (CTUs) of bladder cancer patients between 2001 and 2020. Of 781 patients, 163 underwent chemotherapy, had pre- and post-chemotherapy CTUs, underwent radical cystectomy, and had five-year post-surgery survival follow-up available. Five LLMs (Dolly-v2, Vicuna-13b, Llama-2.0-13b, GPT-3.5, and GPT-4.0) were used to extract clinical descriptors from each patient’s medical records. As a reference standard, clinical descriptors were also extracted manually. Radiomics and deep learning descriptors were extracted from the CTU images. The developed multi-modal predictive model, CRD, was based on the clinical (C), radiomics (R), and deep learning (D) descriptors. The extraction accuracy of each LLM was assessed, and the performance of the survival predictive models was evaluated using the area under the receiver operating characteristic curve (AUC) and Kaplan–Meier analysis. For the 163 patients (mean age 64 ± 9 years; M:F 131:32), the LLMs achieved extraction accuracies of 74% to 87% (Dolly), 76% to 83% (Vicuna), 82% to 93% (Llama), 85% to 91% (GPT-3.5), and 94% to 97% (GPT-4.0). On a test dataset of 64 patients, the CRD model achieved AUCs of 0.89 ± 0.04 (manually extracted information), 0.87 ± 0.05 (Dolly), 0.83 ± 0.06 to 0.84 ± 0.05 (Vicuna), 0.81 ± 0.06 to 0.86 ± 0.05 (Llama), 0.85 ± 0.05 to 0.88 ± 0.05 (GPT-3.5), and 0.87 ± 0.05 to 0.88 ± 0.05 (GPT-4.0). This study demonstrates the use of LLM-extracted clinical information, in conjunction with imaging analysis, to improve the prediction of clinical outcomes, with bladder cancer as an initial example.
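The two evaluation quantities named in the abstract, extraction accuracy against a manual reference and AUC, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the case-insensitive exact-match rule, and the toy values are all assumptions made for illustration, and the abstract does not state how matches were scored or how the AUC standard deviations were estimated.

```python
from typing import Sequence


def extraction_accuracy(extracted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of extracted descriptor values that match the manual reference.

    Assumption: a match is a case-insensitive exact string match.
    """
    if len(extracted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(
        e.strip().lower() == r.strip().lower() for e, r in zip(extracted, reference)
    )
    return matches / len(reference)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up values for illustration only (not study data).
manual = ["T2", "N0", "yes", "T3"]
llm = ["T2", "N1", "Yes", "T3"]
print(extraction_accuracy(llm, manual))

survived = [1, 1, 0, 0, 1]
predicted = [0.9, 0.6, 0.7, 0.2, 0.8]
print(auc(survived, predicted))
```

The pairwise form of the AUC is quadratic in the number of cases, which is acceptable for a test set of the size reported here (64 patients); a rank-based formulation would be preferable for larger cohorts.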
Funder
National Institutes of Health