Toward A Scalable Exploratory Framework for Complex High-Dimensional Phenomics Data

07/14/2017
by Methun Kamruzzaman, et al.

Phenomics is an emerging branch of modern biology that uses high-throughput phenotyping tools to capture multiple environmental and phenotypic trait measurements at a massive scale. The resulting high-dimensional data sets represent a treasure trove of information for providing an in-depth understanding of how multiple factors interact and contribute to controlling the growth and behavior of different plant crop genotypes. However, computational tools that can parse such high-dimensional data sets and aid in extracting plausible hypotheses are currently lacking. In this paper, we present a new algorithmic approach to effectively decode and characterize the role of the environment on phenotypic traits from complex phenomic data. To the best of our knowledge, this effort represents the first application of topological data analysis to phenomics data. We applied this novel algorithmic approach to a real-world maize data set. Our results demonstrate the ability of our approach to delineate emergent behavior among subpopulations, as dictated by one or more environmental factors; notably, our approach shows how the environment plays a key role in determining the phenotypic behavior of one of the two genotypes. Source code for our implementation and test data are freely available with detailed instructions at https://xperthut.github.io/HYPPO-X
