Comparing copy-number profiles under multi-copy amplifications and deletions

02/26/2020
by Garance Cordonnier, et al.

During cancer progression, malignant cells accumulate somatic mutations that can lead to genetic aberrations. In particular, evolutionary events akin to segmental duplications or deletions can alter the copy-number profile (CNP) of a set of genes in a genome. Our aim is to compute the evolutionary distance between two cells for which only CNPs are known, that is, the minimum number of segmental amplifications and deletions needed to turn one CNP into another. This problem was recently formalized in a model where each event is assumed to alter a copy-number by 1 or -1, even though such events can affect large portions of a chromosome. We propose a general cost framework in which an event can modify the copy-number of a gene by larger amounts. We show that any cost scheme that allows segmental deletions of arbitrary length makes computing the distance strongly NP-hard. We then devise a factor-2 approximation algorithm for the problem when copy-numbers are non-zero, and provide an implementation called cnp2cnp. We evaluate our approach experimentally by reconstructing simulated cancer phylogenies from the pairwise distances inferred by cnp2cnp, and compare it against two alternatives, namely the MEDICC distance and the Euclidean distance. The experimental results show that our distance yields more accurate phylogenies on average than these alternatives if the given CNPs are error-free, but that the MEDICC distance is slightly more robust against errors in the data. In all cases, our experiments show that either our approach or the MEDICC approach should be preferred over the Euclidean distance.
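To make the distance notion concrete, here is a minimal Python sketch. It is not the cnp2cnp algorithm and not the paper's general cost framework. It covers only the Euclidean baseline and a brute-force search under the earlier unit-change model, where each event changes the copy-numbers in one contiguous segment by 1 or -1. The function names, the `max_copy` search bound, and the rule that a gene with zero copies stays at zero are assumptions made for illustration.

```python
from collections import deque
from math import sqrt


def euclidean_distance(u, v):
    """Euclidean distance between two CNPs of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def unit_event_distance(source, target, max_copy=None):
    """Brute-force minimum number of unit events turning source into target.

    Assumed model (illustrative only): one event picks a contiguous
    segment and adds +1 or -1 to every non-zero copy-number in it;
    genes at zero copies stay at zero. The search is exponential and
    only usable on tiny profiles.
    """
    source, target = tuple(source), tuple(target)
    if len(source) != len(target):
        raise ValueError("profiles must have the same length")
    n = len(source)
    if n == 0:
        return 0
    if max_copy is None:
        # Hypothetical cap that keeps the search space finite.
        max_copy = max(source + target) + 1
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        profile, dist = queue.popleft()
        if profile == target:
            return dist
        for start in range(n):
            for end in range(start, n):
                for delta in (1, -1):
                    nxt = list(profile)
                    for i in range(start, end + 1):
                        if nxt[i] > 0:  # lost genes are never regained
                            nxt[i] += delta
                    nxt = tuple(nxt)
                    if max(nxt) > max_copy or nxt in seen:
                        continue
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    return None  # target not reachable within the cap


if __name__ == "__main__":
    print(euclidean_distance((1, 1, 1), (2, 2, 1)))
    print(unit_event_distance((1, 1, 1), (2, 2, 1)))
    print(unit_event_distance((1, 1, 1), (2, 1, 2)))
```

In this sketch, turning (1, 1, 1) into (2, 2, 1) takes a single amplification of the first two genes, while reaching (2, 1, 2) takes two events because the amplified genes are not adjacent. The Euclidean distance only measures per-gene differences and ignores this segmental structure.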
