Evaluating and Enhancing Japanese Large Language Models for Genetic Counseling Support: A Comparative Study of Domain Adaptation Methods and Expert-Evaluated Dataset Development (Preprint)

Authors:

Fukushima Takuya, Manabe Masae, Yada Shuntaro, Wakamiya Shoko, Yoshida Akiko, Urakawa Yusaku, Maeda Akiko, Kan Shigeyuki, Takahashi Masayo, Aramaki Eiji

Abstract

BACKGROUND

Advances in genetics have revealed a strong correlation between genetics and health, and the demand for genetic counseling services has increased accordingly. As a result, the shortage of genetic counseling professionals has become a significant challenge. The emergence of large language models (LLMs) in recent years offers a potential solution. However, the current capabilities and limitations of Japanese LLMs in genetic counseling require further investigation. Additionally, to develop a dialogue system that supports genetic counseling, domain adaptation methods for LLMs should be explored, and expert data should be collected to assess the quality of LLM responses.

OBJECTIVE

This study aims to evaluate the current capabilities of, and identify obstacles to, developing an LLM-based dialogue system for genetic counseling. The primary focus is to assess the effectiveness of domain adaptation methods in the context of genetic counseling. Furthermore, we construct a dataset of expert evaluations of responses generated by LLMs adapted with various domain adaptation methods, providing expert feedback for the future development of genetic counseling LLMs.

METHODS

Our study used two main datasets: (1) a question-answering (QA) dataset for LLM adaptation and (2) a genetic counseling question dataset for evaluation. The QA dataset comprised 899 pairs covering topics in medicine and genetic counseling, whereas the evaluation dataset comprised 120 refined questions across six genetic counseling categories. Three domain adaptation methods were applied to a lightweight Japanese LLM: instruction tuning, retrieval-augmented generation (RAG), and prompt engineering. The performance of the adapted LLM was evaluated on the 120 evaluation questions. Two certified genetic counselors and one ophthalmologist assessed the responses generated by the LLM based on four key metrics: (1) inappropriateness of information, (2) sufficiency of information, (3) severity of harm, and (4) alignment with medical consensus.
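As an illustration only, the retrieval step of RAG over a QA dataset can be sketched as follows. This is a minimal sketch under stated assumptions, not the study's implementation: the toy English QA pairs, the bag-of-words cosine similarity, and the names `QA_PAIRS`, `retrieve`, and `build_prompt` are all hypothetical. The abstract does not specify the retriever, the prompt template, or the LLM, and the actual data are Japanese.

```python
import math
import re
from collections import Counter

# Hypothetical toy QA pairs; the study's 899-pair dataset is not reproduced here.
QA_PAIRS = [
    ("What is genetic counseling?",
     "Genetic counseling helps people understand genetic information."),
    ("Who provides genetic counseling?",
     "Certified genetic counselors typically provide it."),
    ("What is a carrier?",
     "A carrier has one copy of a variant for a recessive condition."),
]


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (an assumption; Japanese
    text would need a morphological tokenizer instead)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, pairs, k=1):
    """Return the k QA pairs whose stored question is most similar to the query."""
    query = Counter(tokenize(question))
    ranked = sorted(pairs,
                    key=lambda p: cosine(query, Counter(tokenize(p[0]))),
                    reverse=True)
    return ranked[:k]


def build_prompt(question, pairs, k=1):
    """Prepend the retrieved QA pairs to the user question as reference context."""
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in retrieve(question, pairs, k))
    return f"Reference QA pairs:\n{context}\n\nQuestion: {question}\nAnswer:"
```

The resulting prompt string would then be passed to the LLM. In this sketch, expanding `QA_PAIRS` is what corresponds to expanding the RAG data.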

RESULTS

The evaluation conducted by the certified genetic counselors and the ophthalmologist revealed varied outcomes across the domain adaptation methods. RAG demonstrated promising results, particularly in enhancing key aspects of genetic counseling. Conversely, instruction tuning and prompt engineering yielded less favorable outcomes. This evaluation process led to the construction of a dataset of expert-evaluated responses generated by LLMs adapted using various combinations of these methods. Error analysis highlighted critical ethical concerns, such as the inappropriate promotion of prenatal testing, criticism of relatives, and inaccurate probability statements.

CONCLUSIONS

RAG significantly improved performance across all evaluation criteria, and expanding the RAG data may yield further gains. Our expert-evaluated dataset offers valuable insights for future development. However, the ethical issues identified in LLM responses underscore the importance of continued refinement and careful ethical consideration before these systems are implemented in healthcare settings.

Publisher

JMIR Publications Inc.
