Abstract
Abstract
Background
Machine learning can effectively nominate novel genes for various research purposes in the laboratory. On a genome-wide scale, we implemented multiple databases and algorithms to predict and prioritize the human aging genes (PPHAGE).
Results
We fused data from 11 databases, and used Naïve Bayes classifier and positive unlabeled learning (PUL) methods, NB, Spy, and Rocchio-SVM, to rank human genes in respect with their implication in aging. The PUL methods enabled us to identify a list of negative (non-aging) genes to use alongside the seed (known age-related) genes in the ranking process. Comparison of the PUL algorithms revealed that none of the methods for identifying a negative sample were advantageous over other methods, and their simultaneous use in a form of fusion was critical for obtaining optimal results (PPHAGE is publicly available at https://cbb.ut.ac.ir/pphage).
Conclusion
We predict and prioritize over 3,000 candidate age-related genes in human, based on significant ranking scores. The identified candidate genes are associated with pathways, ontologies, and diseases that are linked to aging, such as cancer and diabetes. Our data offer a platform for future experimental research on the genetic and biological aspects of aging. Additionally, we demonstrate that fusion of PUL methods and data sources can be successfully used for aging and disease candidate gene prioritization.
Publisher
Springer Science and Business Media LLC
Reference99 articles.
1. Korf B, Rimoin D, O’Connor J, Pyeritz R. Nature and frequency of genetic disease. In: Rimoin D, O’Connor J, Pyeritz R, Korf B, editors. Principles and Practice of Medical Genetics. Amsterdam: Elsevier; 2008. pp. 49–51.
2. Yang P, et al. Positive-unlabeled learning for disease gene identification. Bioinformatics. 2012;28(20):2640–7.
3. Al-Turaiki IM, et al. Computational approaches for gene prediction: a comparative survey. Berlin: Springer; 2011.
4. Barabási A-L, Gulbahce N, Loscalzo J. Network medicine: a network-based approach to human disease. Nat Rev Genet. 2011;12(1):56.
5. Piro RM, Di Cunto F. Computational approaches to disease-gene prediction: rationale, classification and successes. FEBS J. 2012;279(5):678–96.
Cited by
10 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献