Knowledge graph construction for heart failure using large language models with prompt engineering

Published:2024-07-02 Volume:18
ISSN:1662-5188
Container-title:Frontiers in Computational Neuroscience
Short-container-title:Front. Comput. Neurosci.

Author:

Xu Tianhan,Gu Yixun,Xue Mantian,Gu Renjie,Li Bin,Gu Xiang

Abstract

Introduction: Constructing an accurate and comprehensive knowledge graph of specific diseases is critical for practical clinical disease diagnosis and treatment, reasoning and decision support, rehabilitation, and health management. For knowledge graph construction tasks (such as named entity recognition and relation extraction), classical BERT-based methods require a large amount of training data to ensure model performance. However, real-world medical annotation data, especially disease-specific annotation samples, are very limited. In addition, existing models do not perform well in recognizing out-of-distribution entities and relations that are not seen in the training phase.

Method: In this study, we present a novel and practical pipeline for constructing a heart failure knowledge graph using large language models and medical expert refinement. We apply prompt engineering to the three phases of the pipeline: schema design, information extraction, and knowledge completion. The best performance is achieved by designing task-specific prompt templates combined with the TwoStepChat approach.

Results: Experiments on two datasets show that the TwoStepChat method outperforms both the Vanilla prompt and the fine-tuned BERT-based baselines. Moreover, our method saves 65% of the time compared to manual annotation and is better suited to extracting out-of-distribution information in real-world settings.

Publisher

Frontiers Media SA

Reference: 60 articles.

1. Healthcare knowledge graph construction: a systematic review of the state-of-the-art, open issues, and opportunities;Abu-Salih;J. Big Data,2023

2. GPT-4 technical report;Achiam;arXiv,2023

3. Large language models are few-shot clinical information extractors;Agrawal,2022

4. Flamingo: a visual language model for few-shot learning;Alayrac;Adv. Neural Inf. Process. Syst.,2022

5. Publicly available clinical BERT embeddings;Alsentzer;arXiv,2019