Skip to main content
Log in

Enhanced Artificial Intelligence Strategies in Renal Oncology: Iterative Optimization and Comparative Analysis of GPT 3.5 Versus 4.0

  • Urologic Oncology
  • Published:
Annals of Surgical Oncology Aims and scope Submit manuscript

Abstract

Background

The rise of artificial intelligence (AI) in medicine has revealed the potential of ChatGPT as a pivotal tool in medical diagnosis and treatment. This study assesses the efficacy of ChatGPT versions 3.5 and 4.0 in addressing renal cell carcinoma (RCC) clinical inquiries. Notably, fine-tuning and iterative optimization of the model corrected ChatGPT’s limitations in this area.

Methods

In our study, 80 RCC-related clinical questions from urology experts were posed three times to both ChatGPT 3.5 and ChatGPT 4.0, seeking binary (yes/no) responses. We then statistically analyzed the answers. Finally, we fine-tuned the GPT-3.5 Turbo model using these questions, and assessed its training outcomes.

Results

We found that the average accuracy rates of answers provided by ChatGPT versions 3.5 and 4.0 were 67.08% and 77.50%, respectively. ChatGPT 4.0 outperformed ChatGPT 3.5, with a higher accuracy rate in responses (p < 0.05). By counting the number of correct responses to the 80 questions, we then found that although ChatGPT 4.0 performed better (p < 0.05), both versions were subject to instability in answering. Finally, by fine-tuning the GPT-3.5 Turbo model, we found that the correct rate of responses to these questions could be stabilized at 93.75%. Iterative optimization of the model can result in 100% response accuracy.

Conclusion

We compared ChatGPT versions 3.5 and 4.0 in addressing clinical RCC questions, identifying their limitations. By applying the GPT-3.5 Turbo fine-tuned model iterative training method, we enhanced AI strategies in renal oncology. This approach is set to enhance ChatGPT’s database and clinical guidance capabilities, optimizing AI in this field.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4

Similar content being viewed by others

Availability of data and materials

All data generated and/or analyzed during this study are included in this published article.

References

  1. Hamet P, Tremblay J. Artificial intelligence in medicine. Metab Clinic Exp. 2017;69(Suppl):S36–40. https://doi.org/10.1016/j.metabol.2017.01.011.

    Article  CAS  Google Scholar 

  2. Tsakos E, Xydias EM, Ziogas AC, et al. Surgical and quality of life outcomes following robotic-assisted (da Vinci) laparoscopic repair of vesicovaginal fistula: a case report and video demonstration. Cureus. 2023;15(7):e42171. https://doi.org/10.7759/cureus.42171.

    Article  PubMed  PubMed Central  Google Scholar 

  3. Sahu A, Mishra J, Kushwaha N. Artificial intelligence (AI) in drugs and pharmaceuticals. Comb Chem High Throughput Screen. 2022;25(11):1818–37. https://doi.org/10.2174/1386207325666211207153943.

    Article  CAS  PubMed  Google Scholar 

  4. Anta JA, Martinez-Ballestero I, Eiroa D, Garcia J, Rodriguez-Comas J. Artificial intelligence for the detection of pancreatic lesions. Int J Comput Assist Radiol Surg. 2022;17(10):1855–65. https://doi.org/10.1007/s11548-022-02706-z.

    Article  PubMed  Google Scholar 

  5. Cascella M, Montomoli J, Bellini V, Bignami E. Evaluating the feasibility of ChatGPT in healthcare: an analysis of multiple clinical and research scenarios. J Med Syst. 2023;47(1):33. https://doi.org/10.1007/s10916-023-01925-4.

    Article  PubMed  PubMed Central  Google Scholar 

  6. Moor M, Banerjee O, Abad ZSH, et al. Foundation models for generalist medical artificial intelligence. Nature. 2023;616(7956):259–65. https://doi.org/10.1038/s41586-023-05881-4.

    Article  ADS  CAS  PubMed  Google Scholar 

  7. OpenAI. ChatGPT: Optimizing language models for dialogue. Available at: https://openai.com/blog/chatgpt. Accessed 1 Sep 2023.

  8. Yeo YH, Samaan JS, Ng WH, et al. Assessing the performance of ChatGPT in answering questions regarding cirrhosis and hepatocellular carcinoma. Clin Mol Hepatol. 2023;29(3):721–32. https://doi.org/10.3350/cmh.2023.0089.

    Article  PubMed  PubMed Central  Google Scholar 

  9. Eysenbach G. The role of ChatGPT, generative language models, and artificial intelligence in medical education: a conversation with ChatGPT and a call for papers. JMIR Med Educ. 2023;9:e46885. https://doi.org/10.2196/46885.

    Article  PubMed  PubMed Central  Google Scholar 

  10. Gilson A, Safranek CW, Huang T, et al. How does ChatGPT perform on the United States medical licensing examination? The implications of large language models for medical education and knowledge assessment. JMIR Med Educ. 2023;9:e45312. https://doi.org/10.2196/45312.

    Article  PubMed  PubMed Central  Google Scholar 

  11. Thirunavukarasu AJ, Ting DSJ, Elangovan K, Gutierrez L, Tan TF, Ting DSW. Large language models in medicine. Nat Med. 2023;29(8):1930–40. https://doi.org/10.1038/s41591-023-02448-8.

    Article  CAS  PubMed  Google Scholar 

  12. Siegel RL, Miller KD, Wagle NS, Jemal A. Cancer statistics, 2023. CA Cancer J Clin. 2023;73(1):17–48. https://doi.org/10.3322/caac.21763.

    Article  PubMed  Google Scholar 

  13. Chen DY, Uzzo RG. Evaluation and management of the renal mass. Med Clin North Am. 2011;95(1):179–89. https://doi.org/10.1016/j.mcna.2010.08.021.

    Article  PubMed  Google Scholar 

  14. Murai M, Oya M. Renal cell carcinoma: etiology, incidence and epidemiology. Curr Opin Urol. 2004;14(4):229–33. https://doi.org/10.1097/01.mou.0000135078.04721.f5.

    Article  PubMed  Google Scholar 

  15. Ljungberg B, Albiges L, Abu-Ghanem Y, et al. European association of urology guidelines on renal cell carcinoma: the 2022 update. Eur Urol. 2022;82(4):399–410. https://doi.org/10.1016/j.eururo.2022.03.006.

    Article  PubMed  Google Scholar 

  16. Rosiello G, Larcher A, Montorsi F, Capitanio U. Renal cancer: overdiagnosis and overtreatment. World J Urol. 2021;39(8):2821–3. https://doi.org/10.1007/s00345-021-03798-z.

    Article  PubMed  Google Scholar 

  17. Sedaghat S. Early applications of ChatGPT in medical practice, education and research. Clin Med. 2023;23(3):278–9. https://doi.org/10.7861/clinmed.2023-0078.

    Article  Google Scholar 

  18. Wu T, He S, Liu J, et al. A brief overview of ChatGPT: the history, status quo and potential future development. IEEE/CAA J Automatica Sinica. 2023;10(5):1122–36. https://doi.org/10.1109/JAS.2023.123618.

    Article  Google Scholar 

  19. Ruksakulpiwat S, Kumar A, Ajibade A. Using ChatGPT in medical research: current status and future directions. J Multidiscip Healthc. 2023;16:1513–20. https://doi.org/10.2147/JMDH.S413470.

    Article  PubMed  PubMed Central  Google Scholar 

  20. Capitanio U, Bensalah K, Bex A, et al. Epidemiology of renal cell carcinoma. Eur Urol. 2019;75(1):74–84. https://doi.org/10.1016/j.eururo.2018.08.036.

    Article  PubMed  Google Scholar 

  21. Campbell SC, Clark PE, Chang SS, Karam JA, Souter L, Uzzo RG. Renal mass and localized renal cancer: evaluation, management, and follow-up: AUA guideline: part I. J Urol. 2021;206(2):199–208. https://doi.org/10.1097/JU.0000000000001911.

    Article  PubMed  Google Scholar 

  22. Kung TH, Cheatham M, Medenilla A, et al. Performance of ChatGPT on USMLE: potential for AI-assisted medical education using large language models. PLOS Digit Health. 2023;2(2):e0000198. https://doi.org/10.1371/journal.pdig.0000198.

    Article  PubMed  PubMed Central  Google Scholar 

  23. Kitamura FC. ChatGPT Is shaping the future of medical writing but still requires human judgment. Radiology. 2023;307(2):e230171. https://doi.org/10.1148/radiol.230171.

    Article  PubMed  Google Scholar 

  24. Dave T, Athaluri SA, Singh S. ChatGPT in medicine: an overview of its applications, advantages, limitations, future prospects, and ethical considerations. Front Artif Intell. 2023;6:1169595. https://doi.org/10.3389/frai.2023.1169595.

    Article  PubMed  PubMed Central  Google Scholar 

Download references

Acknowledgment

The authors gratefully thank the editors and reviewers for their constructive suggestions to improve this manuscript.

Funding

Funding for this study was received from the National Natural Science Foundation of China (82102171); Shenzhen Science and Technology Innovation Commission (RCJC20200714114557005); the National Natural Science Foundation of China (Tianyuan Fund for Mathematics: 12326610) and the Shenzhen Medical Research Fund (SMRF: A2302048).

Author information

Authors and Affiliations

Authors

Contributions

Conceptualizing and designing the experiments: JH, SW and SZ. Analyzed the data: RL, AZ, LP and XX. Contributed reagents/materials/analysis: JZ, FW and FY. Wrote the manuscript: RL, AZ, SZ, LP and XX. All authors have read and approved the final manuscript.

Corresponding authors

Correspondence to Shaohua Zhang PhD, Song Wu PhD or Jianquan Hou PhD.

Ethics declarations

Disclosure

Rui Liang, Anguo Zhao, Lei Peng, Xiaojian Xu, Jianye Zhong, Fan Wu, Fulin Yi, Shaohua Zhang, Song Wu, and Jianquan Hou declare they have no conflicts of interest in relation to this work.

Consent for publication

Not applicable.

Ethics approval and consent to participate

Not applicable.

Provenance and peer review

Not commissioned, externally peer reviewed.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (TIFF 343 KB)

Supplementary file 2 (MP4 73340 KB)

Supplementary file 3 (XLSX 14 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Liang, R., Zhao, A., Peng, L. et al. Enhanced Artificial Intelligence Strategies in Renal Oncology: Iterative Optimization and Comparative Analysis of GPT 3.5 Versus 4.0. Ann Surg Oncol (2024). https://doi.org/10.1245/s10434-024-15107-0

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1245/s10434-024-15107-0

Keywords

Navigation