Enhancing systematic reviews in orthodontics: a comparative examination of GPT-3.5 and GPT-4 for generating PICO-based queries with tailored prompts and configurations

Author:

Demir Gizem Boztaş¹, Süküt Yağızalp¹, Duran Gökhan Serhat¹, Topsakal Kübra Gülnur¹, Görgülü Serkan¹

Affiliation:

1. Department of Orthodontics, Gulhane Faculty of Dentistry, University of Health Sciences, Ankara, Türkiye

Abstract

Objectives: The rapid advancement of large language models (LLMs) has prompted exploration of their efficacy in generating PICO-based (Patient, Intervention, Comparison, Outcome) queries, particularly in orthodontics. This study aimed to assess the usability of LLMs in aiding systematic review processes, specifically by comparing the performance of ChatGPT 3.5 and ChatGPT 4 using a specialized prompt tailored for orthodontics.

Materials/Methods: Five databases were searched to curate a sample of 77 systematic reviews and meta-analyses published between 2016 and 2021. Using prompt engineering techniques, the LLMs were directed to formulate PICO questions, Boolean queries, and relevant keywords. Independent researchers then evaluated the outputs for accuracy and consistency using three-point and six-point Likert scales. In addition, the PICO records of 41 studies, which were compatible with the PROSPERO records, were compared with the responses provided by the models.

Results: ChatGPT 3.5 and 4 showed a consistent ability to craft PICO-based queries. Statistically significant differences in accuracy were observed in specific categories, with GPT-4 often outperforming GPT-3.5.

Limitations: The study’s test set might not cover the full range of LLM application scenarios. The emphasis on specific question types may also not reflect the complete capabilities of the models.

Conclusions/Implications: Both ChatGPT 3.5 and 4 can be valuable tools for generating PICO-driven queries in orthodontics when optimally configured. However, the precision required in medical research calls for careful, critical evaluation of LLM-generated outputs and a cautious integration into scientific investigations.
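For readers unfamiliar with how a PICO question maps onto a Boolean search string, the following is a minimal Python sketch of the general convention: synonyms within one PICO element are joined with OR, and the elements are joined with AND. It is not the prompt or procedure used in the study, which the abstract does not reproduce. The function names `quote` and `build_boolean_query` and the example search terms are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Order in which PICO elements are combined (an assumption for this sketch).
PICO_ORDER = ("patient", "intervention", "comparison", "outcome")


def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def build_boolean_query(pico: Dict[str, List[str]]) -> str:
    """OR the synonyms inside each PICO element, then AND the elements together.

    Elements with no terms (for example, a review without a comparator)
    are skipped rather than emitted as empty groups.
    """
    groups = []
    for element in PICO_ORDER:
        terms = pico.get(element, [])
        if not terms:
            continue
        groups.append("(" + " OR ".join(quote(t) for t in terms) + ")")
    return " AND ".join(groups)


# Illustrative terms only; they are not taken from the study's sample.
example = {
    "patient": ["Class II malocclusion"],
    "intervention": ["functional appliance", "Twin Block"],
    "comparison": [],
    "outcome": ["mandibular growth"],
}
query = build_boolean_query(example)
print(query)
```

A query built this way would still need to be adapted to each database's own syntax (field tags, truncation, controlled vocabulary) before use.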

Publisher

Oxford University Press (OUP)


