A Bayesian framework for molecular strain identification from mixed diagnostic samples

03/07/2018
by Lauri Mustonen, et al.

We provide a mathematical formulation and develop a computational framework for identifying multiple strains of microorganisms from mixed samples of DNA. Our method is applicable in public health domains where efficient identification of pathogens is paramount, such as the monitoring of disease outbreaks. We formulate strain identification as an inverse problem that aims to simultaneously estimate a binary matrix (encoding the presence or absence of mutations) and a real-valued vector (representing the mixture of strains) such that their product is approximately equal to the measured data vector. The problem has a structure similar to blind deconvolution, except that binary constraints are present in the formulation and enforced in our approach. Following a Bayesian approach, we derive a posterior density. We present two computational methods for solving the non-convex maximum a posteriori estimation problem. The first is a local optimization method that is made efficient and scalable by decoupling the problem into smaller independent subproblems, whereas the second yields a global minimizer by converting the problem into a convex mixed-integer quadratic programming problem. The decoupling approach also provides an efficient way to integrate over the posterior, which yields useful information about the ambiguity of the underdetermined problem and, thus, the uncertainty associated with numerical solutions. We evaluate the potential and limitations of our framework in silico using synthetic data with available ground truth.
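
To make the structure of the inverse problem concrete, the following is a minimal sketch and not the authors' implementation. It assumes the data vector has one entry per row of the binary matrix, uses a plain squared misfit with no prior, and treats the mixture vector as already known. Under those assumptions each row of the binary matrix can be fitted on its own by enumerating all binary candidates, which is one plausible reading of the decoupling into independent subproblems. The names `forward` and `fit_rows_given_mixture` are hypothetical.

```python
import itertools

import numpy as np


def forward(X, w):
    """Forward model: data vector predicted from binary matrix X and mixture w."""
    return X @ w


def fit_rows_given_mixture(y, w):
    """For a fixed mixture w, pick for each data entry the binary row
    whose predicted value is closest in squared error (assumed misfit)."""
    n = len(w)
    # All 2^n candidate binary rows; only feasible for a small number of strains.
    candidates = np.array(list(itertools.product([0, 1], repeat=n)))
    predictions = candidates @ w
    # Squared misfit for every (data entry, candidate row) pair.
    misfit = (y[:, None] - predictions[None, :]) ** 2
    best = np.argmin(misfit, axis=1)
    return candidates[best]


# Synthetic example with known ground truth (values chosen for illustration).
X_true = np.array([[1, 0, 0],
                   [0, 1, 1],
                   [1, 1, 0],
                   [0, 0, 0]])
w_true = np.array([0.6, 0.3, 0.1])
y = forward(X_true, w_true)
X_hat = fit_rows_given_mixture(y, w_true)
```

In this noise-free toy case the rows are recovered because every subset of the chosen mixture weights sums to a distinct value. When two subsets sum to the same value, several binary rows explain the data equally well, which illustrates the kind of ambiguity the abstract attributes to the underdetermined problem.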

research
11/08/2019

Maximum a-Posteriori Estimation for the Gaussian Mixture Model via Mixed Integer Nonlinear Programming

We present a global optimization approach for solving the classical maxi...
research
01/20/2020

Mixed integer programming formulation of unsupervised learning

A novel formulation and training procedure for full Boltzmann machines i...
research
01/02/2022

On the convex hull of convex quadratic optimization problems with indicators

We consider the convex quadratic optimization problem with indicator var...
research
02/28/2020

MINA: Convex Mixed-Integer Programming for Non-Rigid Shape Alignment

We present a convex mixed-integer programming formulation for non-rigid ...
research
11/19/2020

Data-Driven Robust Optimization using Unsupervised Deep Learning

Robust optimization has been established as a leading methodology to app...
research
01/22/2020

Optimal binning: mathematical programming formulation

The optimal binning is the optimal discretization of a variable into bin...
research
02/15/2023

Variable Selection for Kernel Two-Sample Tests

We consider the variable selection problem for two-sample tests, aiming ...
