Author:
Drożdż Karolina, Nabrdalik Katarzyna, Kwiendacz Hanna, Hendel Mirela, Olejarz Anna, Tomasik Andrzej, Bartman Wojciech, Nalepa Jakub, Gumprecht Janusz, Lip Gregory Y. H.
Abstract
Background
Nonalcoholic fatty liver disease is associated with an increased cardiovascular disease (CVD) risk, although the exact mechanism(s) are less clear. Moreover, the relationship between newly redefined metabolic-associated fatty liver disease (MAFLD) and CVD risk has been poorly investigated. Data-driven machine learning (ML) techniques may be beneficial in discovering the most important risk factors for CVD in patients with MAFLD.
Methods
In this observational study, patients with MAFLD underwent subclinical atherosclerosis assessment and blood biochemical analysis. Patients were split into two groups based on the presence of CVD (defined as at least one of the following: coronary artery disease; myocardial infarction; coronary bypass grafting; stroke; carotid stenosis; lower extremities artery stenosis).
ML techniques were used to construct a model that could identify individuals with the highest risk of CVD. We used a multiple logistic regression classifier operating on the most discriminative patient parameters, either selected by univariate feature ranking or extracted using principal component analysis (PCA). Receiver operating characteristic (ROC) curves and the area under the ROC curve (AUC) were calculated for the investigated classifiers, and the optimal cut-point values were extracted from the ROC curves using the Youden index, the closest-to-(0, 1) criterion and the Index of Union method.
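The three cut-point criteria named above can be illustrated with a minimal sketch. It is not the authors' code: the function names (`roc_points`, `auc`, `optimal_cutpoints`) and the toy scores and labels are hypothetical, and the definitions follow the commonly used forms, namely Youden's J = sensitivity + specificity - 1 (maximized), the Euclidean distance from the ROC point to (0, 1) (minimized), and the Index of Union |sensitivity - AUC| + |specificity - AUC| (minimized).

```python
from math import hypot


def roc_points(scores, labels):
    """(threshold, sensitivity, specificity) for each distinct score.

    A patient is predicted positive when score >= threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        points.append((t, tp / pos, tn / neg))
    return points


def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def optimal_cutpoints(scores, labels):
    """Threshold chosen by each of the three criteria (first one wins on ties)."""
    pts = roc_points(scores, labels)
    a = auc(scores, labels)
    youden = max(pts, key=lambda p: p[1] + p[2] - 1)
    closest = min(pts, key=lambda p: hypot(1 - p[1], 1 - p[2]))
    union = min(pts, key=lambda p: abs(p[1] - a) + abs(p[2] - a))
    return {"youden": youden[0], "closest_01": closest[0], "index_of_union": union[0]}


# Hypothetical toy data: predicted probabilities and true CVD status (1 = CVD).
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 1, 0, 0, 1, 1, 1]
cutpoints = optimal_cutpoints(toy_scores, toy_labels)
print(cutpoints)
```

The criteria need not agree on a single threshold, which is presumably one reason to report more than one of them.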
Results
Among 191 patients with MAFLD (mean age: 58, SD: 12 years; 46% female), 47 (25%) had a history of CVD. The most important clinical variables included hypercholesterolemia, the plaque scores, and duration of diabetes. The five, ten and fifteen most discriminative parameters extracted using univariate feature ranking and used to fit the ML models yielded AUCs of 0.84 (95% confidence interval [CI]: 0.77–0.90, p < 0.0001), 0.86 (95% CI 0.80–0.91, p < 0.0001) and 0.87 (95% CI 0.82–0.92, p < 0.0001), respectively, whereas the classifier fitted on 10 principal components extracted using PCA followed by parallel analysis achieved an AUC of 0.86 (95% CI 0.81–0.91, p < 0.0001). The best model, operating on the 5 most discriminative features, correctly identified 114/144 (79.17%) low-risk and 40/47 (85.11%) high-risk patients.
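The reported counts can be checked with a few lines of arithmetic. The sketch below assumes that "high-risk" (prevalent CVD) is treated as the positive class, so that the two fractions correspond to sensitivity and specificity; the variable names are illustrative, and the accuracy and Youden's J values are derived here rather than taken from the abstract.

```python
# Counts reported for the best model operating on the 5 most discriminative features.
low_risk_correct, low_risk_total = 114, 144    # patients without CVD
high_risk_correct, high_risk_total = 40, 47    # patients with CVD

total = low_risk_total + high_risk_total       # full cohort size

# Assumption: high-risk (CVD) is the positive class.
specificity = low_risk_correct / low_risk_total
sensitivity = high_risk_correct / high_risk_total

# Derived quantities, not stated in the abstract.
accuracy = (low_risk_correct + high_risk_correct) / total
youden_j = sensitivity + specificity - 1

print(f"cohort size: {total}")
print(f"specificity: {100 * specificity:.2f}%")   # abstract reports 79.17%
print(f"sensitivity: {100 * sensitivity:.2f}%")   # abstract reports 85.11%
print(f"accuracy:    {100 * accuracy:.2f}%")
print(f"Youden's J:  {youden_j:.3f}")
```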
Conclusion
An ML approach demonstrated high performance in identifying MAFLD patients with prevalent CVD based on easy-to-obtain patient parameters.
Funder
Medical University of Silesia
Silesia University of Technology
Publisher
Springer Science and Business Media LLC
Subject
Cardiology and Cardiovascular Medicine; Endocrinology, Diabetes and Metabolism
Cited by
36 articles.