Natural family-free genomic distance

07/07/2020
by Diego P Rubert, et al.

A classical problem in comparative genomics is to compute the rearrangement distance, that is, the minimum number of large-scale rearrangements required to transform one given genome into another. The most traditional approaches in this area are family-based, i.e., they require the classification of DNA fragments into families. More recently, an alternative family-free approach was proposed, which studies rearrangement distances without prior family assignment. On the one hand, computing genomic distances in the family-free setting helps to match occurrences of duplicated genes and to find homologies; on the other hand, this computation is NP-hard. In this paper, we represent structural rearrangements by the generic double cut and join (DCJ) operation and also allow insertions and deletions of DNA segments. On this basis we propose a new and more general family-free genomic distance, together with an efficient ILP formulation to compute it. Our experiments show that the ILP produces accurate results and can handle not only bacterial genomes, but also genomes of fungi and insects, as well as subsets of chromosomes of mammals and plants.
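
To make the notion of a DCJ rearrangement distance concrete, the sketch below covers only the simplest family-based special case, not the family-free distance or the ILP proposed in the paper. It assumes two genomes made of circular chromosomes, with exactly the same genes, no duplicates, and no insertions or deletions. In that restricted setting the DCJ distance is known to equal the number of genes minus the number of cycles in the graph formed by the adjacencies of both genomes. The function names and the signed-integer encoding of genes are illustrative choices, not taken from the paper.

```python
def adjacencies(genome):
    """Map each gene extremity to its neighbouring extremity.

    A genome is a list of circular chromosomes; each chromosome is a list
    of signed integers (illustrative encoding). Gene g has a tail (g, "t")
    and a head (g, "h"); a negative sign means the gene is read reversed.
    """
    partner = {}
    for chrom in genome:
        for i, gene in enumerate(chrom):
            nxt = chrom[(i + 1) % len(chrom)]  # wrap around: circular
            right = (abs(gene), "h" if gene > 0 else "t")
            left = (abs(nxt), "t" if nxt > 0 else "h")
            partner[right] = left
            partner[left] = right
    return partner


def dcj_distance_circular(genome_a, genome_b):
    """DCJ distance for circular genomes with identical, duplicate-free
    gene content (assumption): number of genes minus number of cycles."""
    adj_a = adjacencies(genome_a)
    adj_b = adjacencies(genome_b)
    if adj_a.keys() != adj_b.keys():
        raise ValueError("genomes must contain exactly the same genes")

    seen = set()
    cycles = 0
    for start in adj_a:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Alternate between an adjacency of A and an adjacency of B
        # until the walk returns to an extremity already visited.
        while x not in seen:
            seen.add(x)
            y = adj_a[x]
            seen.add(y)
            x = adj_b[y]

    n_genes = len(adj_a) // 2  # two extremities per gene
    return n_genes - cycles


if __name__ == "__main__":
    # One inversion of gene 2 separates these two circular genomes.
    print(dcj_distance_circular([[1, 2, 3]], [[1, -2, 3]]))
```

The family-free setting described in the abstract is harder precisely because the correspondence between genes is not given in advance, which is why the paper turns to an ILP rather than a direct cycle count like this one.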


