Utilizing Open-Source Language Models and ChatGPT for Zero-Shot Identification of Drug Discontinuation Events in Online Forums: Development and Validation Study (Preprint)

Published: 2023-11-20

Author:

Trevena William, Zhong Xiang, Alvarado Michelle, Semenov Alexander, Oktay Alp, Devlin Devin, Gohil Aarya, Chittimouju Sai Harsha

Abstract

BACKGROUND

The implementation of Transformer-based Natural Language Processing (NLP) systems, such as BERT and GPT-4, has revolutionized the extraction of insights from unstructured text. These advances have reached healthcare, where such systems are used to analyze social media for public health insights. Yet the detection of drug discontinuation events (DDEs) remains underexplored. Identifying DDEs is crucial for understanding medication adherence and patient outcomes.

OBJECTIVE

The objective of this study is to provide a flexible framework for investigating various clinical research questions in data-sparse environments. We exemplify the utility of this framework by identifying DDEs in an open-source online forum, medhelp.org, and by releasing the first open-source DDE datasets to aid further research in this domain.

METHODS

We used pre-trained Transformer-based NLP models, including ChatGPT, DeBERTa, BART, RoBERTa, DistilRoBERTa, and DistilBERT, for zero-shot classification of user comments describing DDEs from medhelp.org.
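
As an illustration of what zero-shot classification of forum comments can look like, the following is a minimal sketch and not the authors' pipeline. It assumes the Hugging Face `transformers` zero-shot pipeline with the `facebook/bart-large-mnli` checkpoint (the abstract says only "BART"); the label wording, the 0.5 threshold, and the function names `load_bart_zero_shot` and `flag_ddes` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical label wording: the abstract does not state the labels or prompts used.
DDE_LABEL = "stopped taking a medication"
OTHER_LABEL = "did not stop taking a medication"


def load_bart_zero_shot():
    """Build a Hugging Face zero-shot pipeline (downloads the model on first use).

    The checkpoint name is an assumption; the abstract names only "BART".
    """
    from transformers import pipeline  # lazy import so flag_ddes works without it

    return pipeline("zero-shot-classification", model="facebook/bart-large-mnli")


def flag_ddes(
    comments: Sequence[str],
    classifier: Callable[..., Dict],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the study
) -> List[bool]:
    """Return True for each comment whose DDE label score reaches the threshold."""
    flags = []
    for text in comments:
        # The pipeline returns a dict with "labels" and "scores" in matching order.
        result = classifier(text, candidate_labels=[DDE_LABEL, OTHER_LABEL])
        scores = dict(zip(result["labels"], result["scores"]))
        flags.append(scores[DDE_LABEL] >= threshold)
    return flags
```

With the model available, `flag_ddes(comments, load_bart_zero_shot())` would score each comment without any fine-tuning. Passing the classifier as an argument keeps the decision logic separate from the model, so any of the other listed models could be swapped in behind the same interface.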

RESULTS

Among the selected models, BART performed best, achieving an F1 score of 0.86207, a false positive rate of 2.8%, and a false negative rate of 6.5% without any fine-tuning. DDEs made up only 10.7% of the dataset, which underscores the models’ robustness to class imbalance.
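
To make the relationship between these metrics concrete, the sketch below computes F1, false positive rate, and false negative rate from a binary confusion matrix using the standard definitions. The counts in the usage note are hypothetical: they assume a sample of 1,000 comments and were chosen only because they are consistent with the reported figures; the abstract does not state the confusion matrix or the sample size. The function name `binary_metrics` is also illustrative.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard binary classification metrics from confusion counts."""

    def ratio(num: float, den: float) -> float:
        # Guard against empty classes instead of raising ZeroDivisionError.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "fpr": ratio(fp, fp + tn),  # share of non-DDE comments flagged as DDEs
        "fnr": ratio(fn, fn + tp),  # share of DDE comments that were missed
        "prevalence": ratio(tp + fn, tp + fp + tn + fn),
    }
```

For example, `binary_metrics(tp=100, fp=25, tn=868, fn=7)` gives an F1 of about 0.86207, a false positive rate of about 2.8%, a false negative rate of about 6.5%, and a prevalence of 10.7%. These hypothetical counts also show why F1 is more informative than accuracy here: with roughly one DDE in ten comments, a precision of 0.80 coexists with a low false positive rate.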

CONCLUSIONS

Our study demonstrates the effectiveness of Transformer-based NLP models, such as ChatGPT and BART, for detecting DDEs from publicly accessible data through zero-shot classification. The robust and scalable framework we propose can aid researchers in addressing data-sparse clinical research questions. The release of open-access DDE datasets stands to stimulate further research and novel discoveries in this area.

Publisher

JMIR Publications Inc.
