Complete Characterization of Incorrect Orthology Assignments in Best Match Graphs

06/03/2020
by David Schaller, et al.

Genome-scale orthology assignments are usually based on reciprocal best matches. In the absence of horizontal gene transfer (HGT), every pair of orthologs forms a reciprocal best match. Incorrect orthology assignments are therefore always false positives in the reciprocal best match graph. We consider duplication/loss scenarios and characterize unambiguous false-positive (u-fp) orthology assignments, that is, edges in the best match graph (BMG) that cannot correspond to orthologs for any gene tree that explains the BMG. We characterize u-fp edges in terms of subgraphs of the BMG and show that, given a BMG, there is a unique "augmented tree" that explains the BMG and identifies all u-fp edges in terms of overlapping sets of species in certain subtrees. The augmented tree can be constructed in polynomial time as a refinement of the unique least resolved tree of the BMG. Removing the u-fp edges from the reciprocal best matches results in a unique orthology assignment.
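To make the notion of a reciprocal best match concrete, the following minimal sketch computes best matches from a small rooted gene tree. It assumes the usual tree-based definition, in which y is a best match of x if no other gene from the species of y has a last common ancestor with x that lies closer to the leaves. The function names, the `parent` and `species` dictionaries, and the toy tree are illustrative assumptions, not taken from the paper, and the sketch does not implement the u-fp characterization or the augmented tree.

```python
def ancestors(parent, v):
    """Return the path from v up to the root, starting with v itself."""
    path = [v]
    while v in parent:
        v = parent[v]
        path.append(v)
    return path


def lca_depth(parent, x, y):
    """Depth (distance from the root) of the last common ancestor of x and y."""
    anc_y = set(ancestors(parent, y))
    lca = next(v for v in ancestors(parent, x) if v in anc_y)
    return len(ancestors(parent, lca)) - 1


def best_match_graph(parent, species):
    """Arcs (x, y) such that y is a best match of x within the species of y."""
    arcs = set()
    for x in species:
        for s in set(species.values()) - {species[x]}:
            candidates = [y for y in species if species[y] == s]
            # A deeper last common ancestor means a closer relative of x.
            deepest = max(lca_depth(parent, x, y) for y in candidates)
            arcs |= {(x, y) for y in candidates
                     if lca_depth(parent, x, y) == deepest}
    return arcs


def reciprocal_best_matches(arcs):
    """Unordered pairs {x, y} for which both (x, y) and (y, x) are arcs."""
    return {frozenset((x, y)) for (x, y) in arcs if (y, x) in arcs}


# Hypothetical toy gene tree: root r has children u and b2; u has children a1 and b1.
parent = {"u": "r", "b2": "r", "a1": "u", "b1": "u"}
# Assumed species labels: a1 belongs to species A, b1 and b2 to species B.
species = {"a1": "A", "b1": "B", "b2": "B"}

bmg = best_match_graph(parent, species)
rbmg = reciprocal_best_matches(bmg)
```

In this toy tree, b2 has a1 as a best match, but a1 prefers b1, so only the pair a1, b1 is reciprocal. Deciding whether a reciprocal edge is a false positive additionally requires reasoning about duplication events, which this sketch leaves out.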

research
04/26/2019

Best Match Graphs and Reconciliation of Gene Trees with Species Trees

A wide variety of problems in computational biology, most notably the as...
research
11/01/2020

Best Match Graphs with Binary Trees

Best match graphs (BMG) are a key intermediate in graph-based orthology ...
research
12/05/2022

Relative Timing Information and Orthology in Evolutionary Scenarios

Evolutionary scenarios describing the evolution of a family of genes wit...
research
03/11/2021

Arc-Completion of 2-Colored Best Match Graphs to Binary-Explainable Best Match Graphs

Best match graphs (BMGs) are vertex-colored digraphs that naturally aris...
research
06/30/2020

Complexity of Modification Problems for Best Match Graphs

Best match graphs (BMGs) are vertex-colored directed graphs that were in...
research
01/18/2021

Least resolved trees for two-colored best match graphs

2-colored best match graphs (2-BMGs) form a subclass of sink-free bi-tra...
research
04/06/2020

SOPanG 2: online searching over a pan-genome without false positives

The pan-genome can be stored as elastic-degenerate (ED) string, a recent...
