Colorful orthology clustering in bounded-degree similarity graphs

Published:2021-11-13 Issue:06 Volume:19 Page:
ISSN:0219-7200
Container-title:Journal of Bioinformatics and Computational Biology
language:en
Short-container-title:J. Bioinform. Comput. Biol.

Author:

Sánchez Alitzel López¹, Lafond Manuel¹

Affiliation:

1. Computer Science Department, Université de Sherbrooke, 2500 Boulevard de l’Université, Sherbrooke, Québec J1K 2R1, Canada

Abstract

Clustering genes in similarity graphs is a popular approach for orthology prediction. Most algorithms group genes without considering their species, which results in clusters that contain several paralogous genes. Moreover, clustering is known to be problematic when in-paralogs arise from ancient duplications. Recently, we proposed a two-step process that avoids these problems. First, we infer clusters that contain only orthologs (i.e. only genes from distinct species), and second, we infer the missing inter-cluster orthologs. In this paper, we focus on the first step, which leads to a problem we call Colorful Clustering. In general, this problem is as hard as classical clustering. However, in similarity graphs, the number of species is usually small, as is the number of neighbors a gene has in other species. We therefore study the clustering problem in which the number of colors is bounded by [Formula: see text], and each gene has at most [Formula: see text] neighbors in another species. We show that the well-known cluster editing formulation remains NP-hard even when [Formula: see text] and [Formula: see text]. We then propose a fixed-parameter algorithm in [Formula: see text] to find the single best cluster in the graph. We implemented this algorithm and included it in the aforementioned two-step approach. Experiments on simulated data show that this approach compares favorably with applying only an unconstrained clustering step.
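
The notions used in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy gene identifiers, and the reading of "neighbors in another species" as a per-species count are all assumptions made here for illustration, and the two bounds elided in the abstract are not reproduced. The sketch checks that a cluster is colorful (no two genes from the same species), counts the edge insertions and deletions that a given clustering implies under the cluster editing formulation, and measures the largest number of neighbors a gene has in a single other species.

```python
from itertools import combinations

def is_colorful(cluster, species):
    """True if no two genes in the cluster share a species (color)."""
    colors = [species[g] for g in cluster]
    return len(colors) == len(set(colors))

def editing_cost(clusters, edges):
    """Edge insertions plus deletions needed to turn the graph into
    the disjoint cliques described by `clusters`."""
    edge_set = {frozenset(e) for e in edges}
    inside = set()
    for cluster in clusters:
        inside.update(frozenset(p) for p in combinations(cluster, 2))
    insertions = len(inside - edge_set)  # missing intra-cluster edges
    deletions = len(edge_set - inside)   # edges crossing clusters
    return insertions + deletions

def max_cross_species_degree(edges, species):
    """Largest number of neighbors any gene has within one other species
    (assumed interpretation of the degree bound)."""
    counts = {}
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            if species[a] != species[b]:
                key = (a, species[b])
                counts[key] = counts.get(key, 0) + 1
    return max(counts.values(), default=0)

# Hypothetical toy instance: four genes from three species.
species = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
edges = [("a1", "b1"), ("b1", "c1"), ("a2", "b1")]
clusters = [["a1", "b1", "c1"], ["a2"]]

all_colorful = all(is_colorful(c, species) for c in clusters)
cost = editing_cost(clusters, edges)
degree = max_cross_species_degree(edges, species)
```

In this toy instance both clusters are colorful, the clustering requires one insertion (a1 to c1) and one deletion (a2 to b1), and gene b1 has two neighbors in species A.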

Publisher

World Scientific Pub Co Pte Ltd

Subject

Computer Science Applications,Molecular Biology,Biochemistry

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0219720021400102

Cited by 1 article.

1. Introduction to the Special Issue of the 18th Annual International RECOMB Satellite Workshop on Comparative Genomics; Journal of Bioinformatics and Computational Biology; 2021-12