Abstract
Research on detecting SNP interactions has recently attracted considerable attention because of its significance for exploring complex diseases. Designing effective swarm intelligence optimization algorithms is a primary approach to this problem. Doing so first requires solving another problem: designing and selecting lightweight scoring criteria that can be calculated in O(m) time and that accurately estimate the degree of association between SNP combinations and disease status. In this study, we propose a high-accuracy scoring criterion (HSICCR) that assesses this degree of association by measuring the degree of causality. First, we approximate two kinds of dependencies according to the structural equation of the causal relationship between an epistatic SNP combination and disease status. Then, inspired by these dependencies, we put forward a scoring criterion that integrates the Hilbert-Schmidt independence criterion (HSIC), a widely used kernel-based measure of statistical dependence. However, the time complexity of computing HSIC is O(m²), which is too costly for it to be an integral part of the scoring criterion. Since the sample spaces of the disease status, the SNP loci and the SNP combinations are small enough, we propose an efficient method of computing HSIC for variables with a small sample space in O(m) time. As a result, HSICCR can be computed in O(m) time in practice. Finally, we compared HSICCR with five representative high-accuracy scoring criteria for detecting SNP interactions on 49 simulated disease models. The experimental results show that the accuracy of our proposed scoring criterion is, overall, state-of-the-art.
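The idea of computing HSIC cheaply for variables with a small sample space can be illustrated with a minimal sketch. It is not necessarily the derivation used in the paper. It assumes that m denotes the number of samples, that the biased empirical estimator tr(KHLH)/m² is the quantity of interest, and that a Gaussian kernel on integer category codes is an acceptable illustrative kernel. The function names `hsic_naive` and `hsic_counts` are hypothetical. Because kernel values depend only on the category values, every sum over sample pairs can be regrouped as a sum over pairs of contingency-table cells, so the data are touched only in a single O(m) counting pass.

```python
from collections import Counter
from math import exp


def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian kernel on numeric category codes (an illustrative assumption)."""
    return exp(-((u - v) ** 2) / (2.0 * sigma ** 2))


def hsic_naive(xs, ys, k=gaussian_kernel, l=gaussian_kernel):
    """Biased empirical HSIC, tr(KHLH) / m^2, summed over sample pairs: O(m^2)."""
    m = len(xs)
    idx = range(m)
    tr_kl = sum(k(xs[i], xs[j]) * l(ys[i], ys[j]) for i in idx for j in idx)
    row_k = [sum(k(xs[i], xs[j]) for j in idx) for i in idx]  # K @ 1
    row_l = [sum(l(ys[i], ys[j]) for j in idx) for i in idx]  # L @ 1
    cross = sum(a * b for a, b in zip(row_k, row_l))          # 1' K L 1
    return (tr_kl - 2.0 * cross / m + sum(row_k) * sum(row_l) / m ** 2) / m ** 2


def hsic_counts(xs, ys, k=gaussian_kernel, l=gaussian_kernel):
    """Same estimator computed from a contingency table.

    Counting is one O(m) pass; the remaining work depends only on the
    number of distinct (x, y) cells, not on m.
    """
    m = len(xs)
    joint = Counter(zip(xs, ys))  # contingency table of (x, y) cells
    nx, ny = Counter(xs), Counter(ys)
    # Kernel row sums, one per category value instead of one per sample.
    rx = {u: sum(k(u, v) * c for v, c in nx.items()) for u in nx}
    ry = {u: sum(l(u, v) * c for v, c in ny.items()) for u in ny}
    tr_kl = sum(c1 * c2 * k(x1, x2) * l(y1, y2)
                for (x1, y1), c1 in joint.items()
                for (x2, y2), c2 in joint.items())
    cross = sum(c * rx[x] * ry[y] for (x, y), c in joint.items())
    total_k = sum(nx[u] * rx[u] for u in nx)  # 1' K 1
    total_l = sum(ny[u] * ry[u] for u in ny)  # 1' L 1
    return (tr_kl - 2.0 * cross / m + total_k * total_l / m ** 2) / m ** 2


if __name__ == "__main__":
    # Toy data: genotype codes 0/1/2 and a binary disease status.
    genotypes = [0, 1, 2, 2, 1, 0, 2, 1, 0, 2]
    status = [0, 0, 1, 1, 0, 0, 1, 1, 0, 1]
    print(hsic_naive(genotypes, status), hsic_counts(genotypes, status))
```

For a single SNP with three genotypes and a binary disease status the table has at most six cells. For a combination of several SNPs the number of cells grows with the order of the combination but stays independent of m, which is what makes a linear-time score plausible in this setting.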
Funder
Guangdong Provincial Medical Research Foundation of China
National Natural Science Foundation of China
Natural Science Foundation of Guangdong Province, China
Subject
General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)