Using machine learning to improve anaphylaxis case identification in medical claims data

Published:2023-10-04 Issue:4 Volume:6
ISSN:2574-2531
Container-title:JAMIA Open
language:en

Author:

Kural Kamil Can¹², Mazo Ilya¹, Walderhaug Mark¹, Santana-Quintero Luis¹, Karagiannis Konstantinos¹, Thompson Elaine E¹, Kelman Jeffrey A³, Goud Ravi¹

Affiliation:

1. Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States

2. School of Systems Biology, George Mason University, Manassas, VA 20110, United States

3. Centers for Medicare & Medicaid Services, Washington, DC 20001, United States

Abstract

Objective: Anaphylaxis is a severe, life-threatening allergic reaction, and identifying it accurately in healthcare databases can help harness the potential of “Big Data” for healthcare or public health purposes.

Methods: This study used claims data obtained between October 1, 2015 and February 28, 2019 from the CMS database to examine the utility of machine learning in identifying incident anaphylaxis cases. We created a feature selection pipeline to identify critical features across different datasets. A variety of unsupervised and supervised methods (eg, Sammon mapping and eXtreme Gradient Boosting) were then used to train models on datasets of differing data quality, reflecting the varying availability and potential rarity of ground truth data in medical databases.

Results: Model accuracies ranged from 47.7% to 94.4% when tested on ground truth data. We also found new features that could help experts enhance existing case-finding algorithms.

Discussion: Developing precise algorithms to detect medical outcomes in claims can be laborious and expensive, particularly for conditions that present and are coded in diverse ways. We found it beneficial to filter out the highly potent codes used for data curation in order to identify underlying patterns and features. To improve rule-based algorithms where necessary, researchers could use model explainers to determine noteworthy features, which could then be shared with experts and included in the algorithm.

Conclusion: Our work suggests that machine learning models can perform at levels similar to a previously published expert case-finding algorithm, while also having the potential to improve performance or streamline algorithm construction by identifying new relevant features.
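The Discussion's point about filtering out the codes used for data curation before looking for other informative features can be illustrated with a small sketch. This is not the study's pipeline: the abstract names eXtreme Gradient Boosting and model explainers, whereas the example below substitutes a plain difference in code prevalence between cases and non-cases as a stand-in ranking. The function name `rank_codes`, the code labels, and the toy claims are all hypothetical.

```python
from collections import Counter


def rank_codes(claims, labels, curation_codes):
    """Rank codes by (prevalence in cases) - (prevalence in non-cases).

    Codes listed in `curation_codes` are dropped first, since they were
    used to build the labels and would otherwise dominate the ranking.
    This is a simple stand-in, not the method used in the study.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one non-case")

    excluded = set(curation_codes)
    pos_counts, neg_counts = Counter(), Counter()
    for codes, label in zip(claims, labels):
        kept = set(codes) - excluded  # count each code once per claim
        (pos_counts if label else neg_counts).update(kept)

    all_codes = set(pos_counts) | set(neg_counts)
    scores = {
        code: pos_counts[code] / n_pos - neg_counts[code] / n_neg
        for code in all_codes
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data: placeholder labels, not real diagnosis codes.
claims = [
    {"CURATION_DX", "EPINEPHRINE_RX", "ER_VISIT"},
    {"CURATION_DX", "EPINEPHRINE_RX"},
    {"ER_VISIT", "ROUTINE_VISIT"},
    {"ROUTINE_VISIT"},
]
labels = [1, 1, 0, 0]  # 1 = case, 0 = non-case

ranking = rank_codes(claims, labels, curation_codes={"CURATION_DX"})
for code, score in ranking:
    print(f"{code}: {score:+.2f}")
```

With the curation code removed, the remaining codes are ordered by how much more often they appear in cases than in non-cases, which is the kind of list that could be handed to experts for review.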

Funder

FDA

Publisher

Oxford University Press (OUP)

Subject

Health Informatics

Link

https://academic.oup.com/jamiaopen/article-pdf/6/4/ooad090/52634160/ooad090.pdf
