Algorithmic approaches to selecting control clones in DNA array hybridization experiments

05/27/2020
by Qi Fu, et al.

We study the problem of selecting control clones in DNA array hybridization experiments. The problem arises in the OFRG (oligonucleotide fingerprinting of rRNA genes) method for analyzing microbial communities. OFRG classifies rRNA gene clones using binary fingerprints created from a series of hybridization experiments, each of which hybridizes a collection of arrayed clones with a single oligonucleotide probe. Each experiment produces one analog signal per clone, and these signals must then be classified, that is, converted into binary values 1 and 0 that represent hybridization and non-hybridization events. In addition to the sample rRNA gene clones, the array contains a number of control clones used to calibrate the classification of the hybridization signals. These control clones must be selected with care to optimize the classification process. We formulate this selection as a combinatorial optimization problem called Balanced Covering. We prove that the problem is NP-hard and give some hardness-of-approximation results. We propose approximation algorithms based on randomized rounding and show that, with high probability, they approximate the optimum solution well. Experimental results confirm that the algorithms find high-quality control clones. The algorithms have been implemented and are publicly available as part of the software package CloneTools.
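To make the role of control clones concrete, the following is a minimal sketch of how one probe's analog signals might be converted to binary values. It is an illustration under stated assumptions, not the classification procedure used by OFRG or CloneTools: the midpoint-threshold rule, the function names `calibrate_threshold` and `classify`, the clone labels, and the signal values are all hypothetical.

```python
from statistics import mean


def calibrate_threshold(signals, positive_controls, negative_controls):
    """Return a cutoff for one probe (assumed rule: midpoint of control means).

    positive_controls: control clones known to hybridize with the probe.
    negative_controls: control clones known not to hybridize with it.
    """
    if not positive_controls or not negative_controls:
        # Without controls of both kinds the probe cannot be calibrated.
        raise ValueError("need at least one control of each kind")
    pos = mean(signals[c] for c in positive_controls)
    neg = mean(signals[c] for c in negative_controls)
    return (pos + neg) / 2


def classify(signals, threshold):
    """Map each clone's analog signal to 1 (hybridization) or 0 (none)."""
    return {clone: 1 if value >= threshold else 0
            for clone, value in signals.items()}


# Hypothetical signals for a single probe: four controls, two sample clones.
signals = {
    "ctrl_a": 0.92, "ctrl_b": 0.85,   # assumed to hybridize
    "ctrl_c": 0.10, "ctrl_d": 0.15,   # assumed not to hybridize
    "s1": 0.80, "s2": 0.20,           # sample clones to classify
}
threshold = calibrate_threshold(signals, ["ctrl_a", "ctrl_b"],
                                ["ctrl_c", "ctrl_d"])
bits = classify(signals, threshold)
```

Under this assumed rule, a probe can be calibrated only if the control set contains both clones that hybridize with it and clones that do not, which gives some intuition for why the selection of control clones needs to be balanced across all probes.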


