Penalized weighted least-squares estimate for variable selection on correlated multiply imputed data

Author:

Li Yang (1), Yang Haoyu (2), Yu Haochen (2), Huang Hanwen (3), Shen Ye (3)

Affiliation:

1. Center for Applied Statistics and School of Statistics, Renmin University of China, Beijing, China

2. School of Statistics, Renmin University of China, Beijing, China

3. College of Public Health, University of Georgia, Athens, Georgia, USA

Abstract

Because imputed values for the same subject are inevitably correlated across imputed datasets, we propose a framework for variable selection on multiply imputed data based on penalized weighted least squares (PWLS–MI). The methodological development is motivated by an epidemiological study of A/H7N9 patients from Zhejiang province in China, where nearly half of the variables are not fully observed. Multiple imputation is a commonly adopted method for handling missing data. However, it generates correlations among the imputed values of the same subject across datasets. Recent work on variable selection for multiply imputed data does not fully address these correlations. PWLS–MI incorporates them when performing variable selection. Because it allows various penalties, PWLS–MI can be regarded as a general framework for variable selection on multiply imputed data; we use the adaptive LASSO as an illustrative example. Extensive simulation studies compare PWLS–MI with recently developed methods, and the results suggest that the proposed approach outperforms them in both selection accuracy and deletion accuracy. When applied to the A/H7N9 database, PWLS–MI is shown to select variables with clinical relevance.
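The general idea of a penalized weighted least-squares fit on stacked imputed datasets can be sketched as follows. This is only an illustrative sketch, not the authors' PWLS–MI estimator: this record does not give the paper's weights, so uniform weights 1/M are used as a placeholder that ignores within-subject correlation. The simulated data, the crude random-draw imputation, the tuning value `lam`, and the helper name `weighted_adaptive_lasso` are all assumptions made for illustration.

```python
import numpy as np

def weighted_adaptive_lasso(X, y, w, lam, gamma=1.0, n_iter=200):
    """Coordinate descent for
    0.5 * sum_i w_i (y_i - x_i'b)^2 + lam * sum_j |b_j| / |b_init_j|^gamma,
    where b_init is the weighted least-squares fit (hypothetical helper)."""
    sw = np.sqrt(w)
    Xw, yw = X * sw[:, None], y * sw           # absorb weights into the data
    b_init = np.linalg.lstsq(Xw, yw, rcond=None)[0]
    pen = lam / (np.abs(b_init) ** gamma + 1e-8)  # adaptive penalty per coefficient
    col_ss = (Xw ** 2).sum(axis=0)
    b = np.zeros(X.shape[1])
    r = yw.copy()                               # residual for b = 0
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            r += Xw[:, j] * b[j]                # remove coordinate j from the fit
            rho = Xw[:, j] @ r
            b[j] = np.sign(rho) * max(abs(rho) - pen[j], 0.0) / col_ss[j]
            r -= Xw[:, j] * b[j]                # add updated coordinate back
    return b

# Simulated example (all values are assumptions, not data from the study).
rng = np.random.default_rng(0)
n, p, M = 200, 6, 5
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0, 0.0])
X = rng.standard_normal((n, p))
y = X @ beta_true + rng.standard_normal(n)

# Make roughly 30% of one covariate missing, then build M imputed copies
# with a crude stand-in for multiple imputation: random draws from the
# observed marginal distribution of that covariate.
miss = rng.random(n) < 0.3
obs = X[~miss, 2]
X_parts, y_parts = [], []
for _ in range(M):
    Xm = X.copy()
    Xm[miss, 2] = rng.normal(obs.mean(), obs.std(), size=miss.sum())
    X_parts.append(Xm)
    y_parts.append(y)
X_stack = np.vstack(X_parts)
y_stack = np.concatenate(y_parts)

# Placeholder weights: each of a subject's M rows counts 1/M.
w = np.full(n * M, 1.0 / M)

beta_hat = weighted_adaptive_lasso(X_stack, y_stack, w, lam=5.0)
selected = np.flatnonzero(beta_hat)             # indices of selected variables
print(selected, np.round(beta_hat, 3))
```

Stacking the imputed datasets yields a single coefficient vector, so each variable is either selected or dropped once rather than separately per imputed dataset. Replacing the uniform weights with weights that reflect the correlation among a subject's imputed rows is the part that the abstract attributes to PWLS–MI.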

Funder

National Natural Science Foundation of China

Renmin University of China

MOE Project of Key Research Institute of Humanities and Social Sciences

Publisher

Oxford University Press (OUP)

Subject

Statistics, Probability and Uncertainty; Statistics and Probability

