Learning interpretable models of phenotypes from whole genome sequences with the Set Covering Machine

12/02/2014
by Alexandre Drouin, et al.

The increased affordability of whole genome sequencing has motivated its use for phenotypic studies. We address the problem of learning interpretable models for discrete phenotypes from whole genomes. We propose a general approach that relies on the Set Covering Machine and a k-mer representation of the genomes. We show results for the problem of predicting the resistance of Pseudomonas aeruginosa, an important human pathogen, against four antibiotics. Our results demonstrate that this approach can learn extremely sparse models that are biologically relevant.
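
To make the approach more concrete, the following is a minimal sketch of how a k-mer representation could be combined with a greedy, Set Covering Machine-style learner. It is not the authors' implementation. The rule type (presence or absence of a single k-mer), the conjunction form, the penalty parameter `p`, the `max_rules` cap, all function names, and the toy sequences are assumptions made for illustration only.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train_scm_conjunction(genomes, labels, k, p=1.0, max_rules=3):
    """Greedily build a conjunction of k-mer presence/absence rules.

    A rule is a pair (kmer, must_be_present). A genome is predicted
    positive only if it satisfies every rule in the model.
    `p` (assumed here) penalises positive examples rejected by a rule.
    """
    kmer_sets = [kmers(g, k) for g in genomes]
    vocabulary = sorted(set().union(*kmer_sets))
    negatives = {i for i, y in enumerate(labels) if not y}
    positives = {i for i, y in enumerate(labels) if y}
    model = []

    while negatives and len(model) < max_rules:
        best = None
        best_utility = float("-inf")
        for km in vocabulary:
            for present in (True, False):
                # Examples on which this rule outputs False.
                covered = {i for i in negatives
                           if (km in kmer_sets[i]) != present}
                errors = {i for i in positives
                          if (km in kmer_sets[i]) != present}
                utility = len(covered) - p * len(errors)
                if utility > best_utility:
                    best_utility = utility
                    best = ((km, present), covered, errors)
        if best is None or not best[1]:
            break  # no rule rejects any remaining negative example
        rule, covered, errors = best
        model.append(rule)
        negatives -= covered   # these negatives are now rejected
        positives -= errors    # these positives are already misclassified
    return model


def predict(model, genome, k):
    """Return True if the genome satisfies every rule of the conjunction."""
    found = kmers(genome, k)
    return all((km in found) == present for km, present in model)


# Toy data (hypothetical sequences, not real genomes).
toy_genomes = ["ACGTTGCA", "TTACGTAA", "GGCCTTAA", "CCGGAATT"]
toy_labels = [True, True, False, False]  # True = resistant (assumed)
toy_model = train_scm_conjunction(toy_genomes, toy_labels, k=3)
```

On this toy data a single rule suffices to separate the two classes, which illustrates why models of this kind can be very sparse and can be read directly as statements about specific k-mers.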


