A Statistical Perspective on the Challenges in Molecular Microbial Biology

03/06/2021
by Pratheepa Jeganathan, et al.

High-throughput sequencing (HTS) technology makes it possible to identify and quantify non-culturable microbial organisms in all environments. Microbial sequences have enhanced our understanding of the human microbiome, the soil and plant environment, and the marine environment. All molecular microbial data pose statistical challenges due to contaminant sequences from reagents, batch effects, unequal sampling, and undetected taxa. Technical biases and heteroscedasticity have the strongest effects, but differences in strains across subjects and environments also make direct differential abundance testing unwieldy. We introduce a few statistical tools that can overcome some of these difficulties and demonstrate them on an example. We show how standard statistical methods, such as simple hierarchical mixture models and topic models, can facilitate inferences on latent microbial communities. We also review some nonparametric Bayesian approaches that combine visualization and uncertainty quantification. The intersection of molecular microbial biology and statistics is an exciting new venue. Finally, we list some of the important open problems that would benefit from more careful statistical method development.


