Affiliation:
1. School of Statistics and Data Science, Nankai University, Tianjin 300071, People's Republic of China
2. College of Mathematics and System Science, Xinjiang University, Urumqi 830046, Xinjiang, People's Republic of China
3. School of Data Science and Artificial Intelligence, Dongbei University of Finance and Economics, Dalian 116025, Liaoning, People's Republic of China
4. School of Mathematics and Statistics, Ningxia University, Yinchuan 750021, Ningxia, People's Republic of China
Abstract
Microbiome data typically lie in a high‐dimensional simplex. A key question in metagenomic analysis is how to exploit the covariance structure of such data. In this paper, we develop a framework called approximate‐estimate‐threshold (AET) for robust basis covariance estimation with high‐dimensional microbiome data. Specifically, we first construct a proxy matrix that is almost indistinguishable from the true basis covariance matrix. Then, any estimator satisfying certain conditions can be used to estimate the proxy matrix. Finally, we apply a thresholding step to this estimate to obtain the final estimator. In particular, this paper uses a Huber‐type estimator and achieves robustness by requiring only bounded moments of some order above 2. We derive the convergence rate of the resulting estimator under the spectral norm and provide theoretical guarantees on support recovery. Extensive simulations and a real‐data example illustrate the empirical performance of our method.
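The three AET steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions that the abstract does not confirm: the proxy is taken to be the covariance of the centered log‐ratio (clr) transformed compositions, the Huber‐type estimator is replaced by a simple element‐wise truncation of median‐centered products, and the final step uses hard thresholding of off‐diagonal entries. The function names and the tuning parameters `tau` (truncation level) and `lam` (threshold) are hypothetical.

```python
import numpy as np

def clr(X):
    """Centered log-ratio transform; rows of X are compositions with positive entries."""
    logX = np.log(X)
    return logX - logX.mean(axis=1, keepdims=True)

def truncated_cov(Z, tau):
    """Element-wise truncated covariance (a simplified stand-in for a Huber-type estimator).

    Columns are centered by their medians, and each pairwise product is
    clipped to [-tau, tau] before averaging to limit the influence of outliers.
    """
    Zc = Z - np.median(Z, axis=0)
    prods = Zc[:, :, None] * Zc[:, None, :]  # shape (n, p, p)
    return np.clip(prods, -tau, tau).mean(axis=0)

def hard_threshold(S, lam):
    """Zero out off-diagonal entries with magnitude below lam; keep the diagonal."""
    T = np.where(np.abs(S) >= lam, S, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T

def aet_sketch(X, tau, lam):
    """Approximate (clr proxy, assumed), estimate (truncation), threshold."""
    Z = clr(X)                      # step 1: work with the assumed proxy
    S = truncated_cov(Z, tau)       # step 2: robust estimate of the proxy
    return hard_threshold(S, lam)   # step 3: sparsify the estimate

# Illustrative usage on synthetic compositions (not data from the paper).
rng = np.random.default_rng(0)
W = rng.lognormal(size=(50, 6))
X = W / W.sum(axis=1, keepdims=True)
est = aet_sketch(X, tau=5.0, lam=0.2)
```

In practice `tau` and `lam` would need to be chosen in a data‐driven way, for example by cross‐validation. The values above are placeholders for illustration only.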
Funder
National Natural Science Foundation of China