Binary classification with fuzzy logistic regression under class imbalance and complete separation in clinical studies

Author:

Charizanos Georgios,Demirhan Haydar,İçen Duygu

Abstract

Abstract Background In binary classification for clinical studies, an imbalanced distribution of cases to classes and an extreme association level between the binary dependent variable and a subset of independent variables can create significant classification problems. These crucial issues, namely class imbalance and complete separation, lead to classification inaccuracy and biased results in clinical studies. Method To deal with class imbalance and complete separation problems, we propose using a fuzzy logistic regression framework for binary classification. Fuzzy logistic regression incorporates combinations of triangular fuzzy numbers for the coefficients, inputs, and outputs and produces crisp classification results. The fuzzy logistic regression framework shows strong classification performance due to fuzzy logic’s better handling of imbalance and separation issues. Hence, classification accuracy is improved, mitigating the risk of misclassified conditions and biased insights for clinical study patients. Results The performance of the fuzzy logistic regression model is assessed on twelve binary classification problems with clinical datasets. The model has consistently high sensitivity, specificity, F1, precision, and Mathew’s correlation coefficient scores across all clinical datasets. There is no evidence of impact from the imbalance or separation that exists in the datasets. Furthermore, we compare the fuzzy logistic regression classification performance against two versions of classical logistic regression and six different benchmark sources in the literature. These six sources provide a total of ten different proposed methodologies, and the comparison occurs by calculating the same set of classification performance scores for each method. Either imbalance or separation impacts seven out of ten methodologies. The remaining three produce better classification performance in their respective clinical studies. However, these are all outperformed by the fuzzy logistic regression framework. Conclusion Fuzzy logistic regression showcases strong performance against imbalance and separation, providing accurate predictions and, hence, informative insights for classifying patients in clinical studies.

Publisher

Springer Science and Business Media LLC

Reference44 articles.

1. Hall GC, Lanes S, Bollaerts K, Zhou X, Ferreira G, Gini R. Outcome misclassification: impact, usual practice in pharmacoepidemiology database studies and an online aid to correct biased estimates of risk ratio or cumulative incidence. Pharmacoepidemiol Drug Saf. 2020;29(11):1450–5.

2. AlKahya MA, Alreahan HO, Algamal ZY. Classication of Breast Cancer Histopathological Images using Adaptive Penalized Logistic Regression with Wilcoxon Rank Sum Test. Electron J Appl Stat Anal. 2023;16(3):507–18.

3. Itoo F, Meenakshi, Singh S. Comparison and analysis of logistic regression, Naïve Bayes and KNN machine learning algorithms for credit card fraud detection. Int J Inform Technol. 2021;13:1503–11.

4. Luque A, Carrasco A, Martín A, de Las Heras A. The impact of class imbalance in classification performance metrics based on the binary confusion matrix. Pattern Recog. 2019;91:216–31.

5. Rahman MS, Sultana M. Performance of Firth-and logF-type penalized methods in risk prediction for small or sparse binary data. BMC Med Res Methodol. 2017;17:1–15.

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3