A critical assessment of using ChatGPT for extracting structured data from clinical notes

Published:2024-05-01 Issue:1 Volume:7 Page:
ISSN:2398-6352
Container-title:npj Digital Medicine
language:en
Short-container-title:npj Digit. Med.

Author:

Huang Jingwei, Yang Donghan M., Rong Ruichen, Nezafati Kuroush, Treager Colin, Chi Zhikai, Wang Shidan, Cheng Xian, Guo Yujia, Klesse Laura J., Xiao Guanghua, Peterson Eric D., Zhan Xiaowei, Xie Yang

Abstract

Existing natural language processing (NLP) methods to convert free-text clinical notes into structured data often require problem-specific annotations and model training. This study aims to evaluate ChatGPT’s capacity to extract information from free-text medical notes efficiently and comprehensively. We developed a large language model (LLM)-based workflow that applies systems engineering methodology and a spiral “prompt engineering” process, and uses OpenAI’s API to batch-query ChatGPT. We evaluated the effectiveness of this method using a dataset of more than 1000 lung cancer pathology reports and a dataset of 191 pediatric osteosarcoma pathology reports, comparing the ChatGPT-3.5 (gpt-3.5-turbo-16k) outputs with expert-curated structured data. In the lung cancer dataset, ChatGPT-3.5 extracted pathological classifications with an overall accuracy of 89%, outperforming two traditional NLP methods. Performance is influenced by the design of the instructive prompt. Our case analysis shows that most misclassifications were due to a lack of highly specialized pathology terminology and erroneous interpretation of TNM staging rules. Reproducibility analysis shows relatively stable performance of ChatGPT-3.5 over time. In the pediatric osteosarcoma dataset, ChatGPT-3.5 classified grades and margin status with accuracies of 98.6% and 100%, respectively. Our study shows the feasibility of using ChatGPT to process large volumes of clinical notes for structured information extraction without requiring extensive task-specific human annotation and model training. The results underscore the potential role of LLMs in transforming unstructured healthcare data into structured formats, thereby supporting research and aiding clinical decision-making.
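
The workflow summarized in the abstract (prompt each report, collect structured output, compare against expert-curated labels) can be illustrated with a minimal sketch. This is not the authors' code: the field names in `FIELDS`, the prompt wording, and the helper names are hypothetical, and `fake_model` is a stand-in for a real API call so that the example runs offline.

```python
import json
from typing import Callable, Dict, List

# Hypothetical field names; the schema used in the study is not given in this record.
FIELDS = ["pT", "pN", "stage", "histology"]


def build_prompt(report: str) -> str:
    """Assemble an instruction prompt asking for one JSON object with fixed keys."""
    keys = ", ".join(FIELDS)
    return (
        "Extract the following fields from the pathology report and reply "
        f"with a single JSON object with keys: {keys}. "
        'Use "Unknown" when a field is not stated.\n\n'
        f"Report:\n{report}"
    )


def parse_reply(reply: str) -> Dict[str, str]:
    """Parse a model reply; missing fields or invalid JSON become 'Unknown'."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {field: str(data.get(field, "Unknown")) for field in FIELDS}


def extract_batch(reports: List[str], ask_model: Callable[[str], str]) -> List[Dict[str, str]]:
    """Query the model once per report and parse each reply."""
    return [parse_reply(ask_model(build_prompt(report))) for report in reports]


def field_accuracy(predicted: List[Dict[str, str]], curated: List[Dict[str, str]]) -> Dict[str, float]:
    """Per-field fraction of reports where the extraction matches the curated label."""
    if not curated or len(predicted) != len(curated):
        raise ValueError("predicted and curated must be non-empty and equal in length")
    total = len(curated)
    return {
        field: sum(p[field] == c[field] for p, c in zip(predicted, curated)) / total
        for field in FIELDS
    }


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM API call (assumption: a real client would go here)."""
    if "adenocarcinoma" in prompt:
        return '{"pT": "T1", "pN": "N0", "stage": "I", "histology": "Adenocarcinoma"}'
    return "not json"


reports = ["Lung, right upper lobe: adenocarcinoma, 2.0 cm.", "Illegible scan."]
curated = [
    {"pT": "T1", "pN": "N0", "stage": "I", "histology": "Adenocarcinoma"},
    {"pT": "T2", "pN": "Unknown", "stage": "II", "histology": "Squamous cell carcinoma"},
]
predicted = extract_batch(reports, fake_model)
accuracy = field_accuracy(predicted, curated)
print(accuracy)
```

Passing the model as a plain callable keeps the evaluation logic independent of any particular API client; swapping `fake_model` for a function that calls a hosted model would leave the parsing and scoring unchanged.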

Funder

U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences

U.S. Department of Health & Human Services | NIH | National Cancer Institute

U.S. Department of Health & Human Services | NIH | National Institute of Dental and Craniofacial Research

Cancer Prevention and Research Institute of Texas

Division of Intramural Research, National Institute of Allergy and Infectious Diseases

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s41746-024-01079-8.pdf

References: 32 articles.

1. Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017).

2. Devlin, J. et al. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).

3. Radford, A. et al. Improving language understanding by generative pre-training. OpenAI: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf (2018).

4. Touvron, H. et al. LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023).

5. OpenAI. GPT-4 Technical Report. arXiv preprint arXiv:2303.08774. https://arxiv.org/pdf/2303.08774.pdf (2023).

Cited by 11 articles.

1. From Text to Data: Automatically Extracting Data From Catheterization Reports Using Generative Artificial Intelligence;Journal of the Society for Cardiovascular Angiography & Interventions;2024-09

2. Accuracy, Consistency, and Hallucination of Large Language Models When Analyzing Unstructured Clinical Notes in Electronic Medical Records;JAMA Network Open;2024-08-13

3. Generative Large Language Models in Electronic Health Records for Patient Care Since 2023: A Systematic Review;2024-08-12

4. Assessing the ability of ChatGPT to extract natural product bioactivity and biosynthesis data from publications;2024-08-02

5. A survey analysis of the adoption of large language models among pathologists;American Journal of Clinical Pathology;2024-07-27