Investigate the Value of the Random Forest Algorithm in Assessing Risk factors for Chronic Kidney Disease (Preprint)

Author:

Liu Pei, Liu Yijun, Liu Hao, Xiong Linping, Mei Changlin, Yuan Lei

Abstract

BACKGROUND

Chronic kidney disease (CKD) is a chronic structural and functional disorder of the kidney with various causes, and it is a major global health concern. Studies suggest that the mortality rate caused by CKD rose by an average of 3.4% per year from 1990 to 2015, that the current global prevalence is 14.3%, and that the mortality rate of CKD will reach about 14 deaths per 100,000 by 2030. In addition, the economic burden of CKD represents 31.4% of the global annual burden of living with disability and is growing by 1% per year. In China, the prevalence of CKD among people over 18 years old is 10.8%, corresponding to approximately 120 million patients, or about 1 in every 10 people. In Shanghai, the prevalence is even higher at 11.8%, or about 1 in every 8-9 people, and only 12.5% of patients are aware of their disease.

OBJECTIVE

The aim of this study was to investigate the value of the Random Forest (RF) algorithm for assessing risk factors associated with CKD.

METHODS

A population of 40,686 individuals with CKD was identified from those who underwent screening between 1 January 2015 and 22 December 2020 in Jing'an District, Shanghai, China. Based on GFR staging and albuminuria grouping, we divided these individuals into those who required management and those who did not. We used a logistic regression (LR) model to analyze the relationship between CKD and risk factors. The RF algorithm, a machine learning method, was used to score the predictor variables and rank them by importance in order to construct a prediction model.
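The grouping step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the category cut-offs follow the widely used KDIGO GFR (G1 to G5) and albuminuria (A1 to A3) categories, while the function names and, in particular, the rule that "moderate" risk or above means "requires management" are assumptions made here, since the abstract does not state the cut-off that was used.

```python
# Sketch only: KDIGO-style categories; the management cut-off is an ASSUMPTION.

def gfr_category(egfr):
    """GFR category from eGFR in mL/min/1.73 m^2."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def albuminuria_category(acr):
    """Albuminuria category from urinary albumin-creatinine ratio in mg/g."""
    if acr < 30:
        return "A1"
    if acr <= 300:
        return "A2"
    return "A3"


# Risk grid indexed by GFR category, then by albuminuria column (A1, A2, A3).
RISK_GRID = {
    "G1": ("low", "moderate", "high"),
    "G2": ("low", "moderate", "high"),
    "G3a": ("moderate", "high", "very high"),
    "G3b": ("high", "very high", "very high"),
    "G4": ("very high", "very high", "very high"),
    "G5": ("very high", "very high", "very high"),
}
RISK_ORDER = ["low", "moderate", "high", "very high"]


def kdigo_risk(egfr, acr):
    """Combine GFR staging and albuminuria grouping into one risk level."""
    column = {"A1": 0, "A2": 1, "A3": 2}[albuminuria_category(acr)]
    return RISK_GRID[gfr_category(egfr)][column]


def requires_management(egfr, acr, threshold="moderate"):
    """Hypothetical rule: flag anyone at or above `threshold` risk."""
    return RISK_ORDER.index(kdigo_risk(egfr, acr)) >= RISK_ORDER.index(threshold)


if __name__ == "__main__":
    for egfr, acr in [(95, 10), (70, 50), (50, 100), (20, 10)]:
        print(egfr, acr, kdigo_risk(egfr, acr), requires_management(egfr, acr))
```

The `threshold` parameter is exposed so that a different management cut-off can be tried without touching the grid itself.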

RESULTS

The LR model indicated that women had a lower risk of CKD than men, that the risk of CKD increased with age, that CKD risk was higher in individuals whose BMI exceeded the normal range, and that those with an abnormal eGFR index had a higher risk of CKD. Furthermore, retired individuals had a higher risk of CKD than others, and those with urban employees' medical insurance had a higher risk than those with other types of medical insurance. According to the RF model, the risk factors for CKD ranked by importance as follows: age, albuminuria, occupation, urinary albumin creatinine ratio, type of health insurance, eGFR index, urinary routine protein index, BMI, gender, history of hypertension, and blood creatinine index.
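To make the idea of an importance ranking concrete, the sketch below shows how a forest-style model can score and order predictors. It is a deliberately simplified stand-in, not the authors' model: each "tree" is a depth-1 decision stump fitted on a bootstrap sample with a random subset of features, and importance is the normalized total decrease in Gini impurity. The data are synthetic, and the feature names (`age`, `bmi`, `sex`, `noise`) and the helper `make_toy_data` are hypothetical.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_stump(rows, labels, features):
    """Return (feature, impurity decrease) for the best single split."""
    n = len(rows)
    parent = gini(labels)
    best = (None, 0.0)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            if parent - child > best[1]:
                best = (f, parent - child)
    return best


def stump_forest_importance(rows, labels, n_trees=100, seed=0):
    """Rank features by normalized total Gini decrease over many stumps."""
    rng = random.Random(seed)
    names = list(rows[0])
    k = max(1, round(len(names) ** 0.5))  # features tried per stump
    totals = dict.fromkeys(names, 0.0)
    n = len(rows)
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feats = rng.sample(names, k)  # random feature subset
        f, gain = best_stump([rows[i] for i in idx],
                             [labels[i] for i in idx], feats)
        if f is not None:
            totals[f] += gain
    scale = sum(totals.values()) or 1.0
    ranked = [(f, v / scale) for f, v in totals.items()]
    return sorted(ranked, key=lambda kv: kv[1], reverse=True)


def make_toy_data(n=120, seed=1):
    """SYNTHETIC data: the outcome depends on age only, with 10% label noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        row = {"age": rng.randint(30, 89), "bmi": rng.randint(18, 35),
               "sex": rng.randint(0, 1), "noise": rng.randint(0, 9)}
        y = 1 if row["age"] >= 60 else 0
        if rng.random() < 0.1:
            y = 1 - y
        rows.append(row)
        labels.append(y)
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_toy_data()
    for name, score in stump_forest_importance(rows, labels):
        print(f"{name:6s} {score:.3f}")
```

Because the synthetic outcome is driven by `age` alone, that feature should come out on top. A full Random Forest grows deeper trees and is normally fitted with an established library rather than hand-written code.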

CONCLUSIONS

Our findings suggest that the RF algorithm has significant predictive value for assessing risk factors associated with CKD. Moreover, older age, abnormal urine biomarkers, and BMI were identified as primary risk factors for CKD. The RF algorithm offers high accuracy, stability, and ease of operation. Additionally, it avoids overfitting in classification and prediction.

Publisher

JMIR Publications Inc.
