PANTHER: Pathway Augmented Nonnegative Tensor factorization for HighER-order feature learning

by Yuan Luo, et al.

Genetic pathways usually encode molecular mechanisms that can inform targeted interventions. Existing machine learning approaches often struggle to jointly model genetic pathways (higher-order features) and variants (atomic features) and to present interpretable models to clinicians. To build more accurate and more interpretable machine learning models for genetic medicine, we introduce Pathway Augmented Nonnegative Tensor factorization for HighER-order feature learning (PANTHER). PANTHER selects informative genetic pathways that directly encode molecular mechanisms. We apply genetically motivated constrained tensor factorization to group pathways in a way that reflects molecular mechanism interactions. We then train a softmax classifier for disease types using the identified pathway groups. We evaluated PANTHER against multiple state-of-the-art constrained tensor/matrix factorization models, as well as group-guided and Bayesian hierarchical models. PANTHER significantly outperforms all state-of-the-art comparison models (p < 0.05). Our experiments on large-scale Next Generation Sequencing (NGS) and whole-genome genotyping datasets also demonstrate the wide applicability of PANTHER. We performed feature analysis for disease type prediction, which suggested insights into, and benefits of, the identified pathway groups.
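The two-stage idea described above (nonnegative tensor factorization, then a softmax classifier on the learned groups) can be illustrated with a minimal NumPy sketch. This is not the PANTHER implementation: it uses a plain nonnegative CP factorization with multiplicative updates and omits the genetically motivated constraints the abstract mentions. The tensor layout (patients x pathways x genes), the rank, the learning rate, the synthetic data, and all function names are assumptions made only for illustration.

```python
import numpy as np

def unfold(X, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(mats):
    """Column-wise Kronecker product of a list of (I_k x R) matrices."""
    out = mats[0]
    for M in mats[1:]:
        out = np.einsum("ir,jr->ijr", out, M).reshape(-1, out.shape[1])
    return out

def nonneg_cp(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Nonnegative CP factorization via multiplicative updates.

    Plain nonnegativity only; PANTHER's extra constraints are not modeled.
    """
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(X.ndim):
            others = [f for m, f in enumerate(factors) if m != mode]
            kr = khatri_rao(others)
            numer = unfold(X, mode) @ kr
            denom = factors[mode] @ (kr.T @ kr) + eps
            factors[mode] *= numer / denom  # update keeps entries nonnegative
    return factors

def train_softmax(F, y, n_classes, lr=0.5, n_iter=500):
    """Multinomial logistic regression trained by batch gradient descent."""
    W = np.zeros((F.shape[1], n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot labels
    for _ in range(n_iter):
        logits = F @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        G = (P - Y) / len(y)  # gradient of mean cross-entropy w.r.t. logits
        W -= lr * F.T @ G
        b -= lr * G.sum(axis=0)
    return W, b

def predict(F, W, b):
    return np.argmax(F @ W + b, axis=1)

# Purely synthetic toy data (hypothetical shapes and labels).
rng = np.random.default_rng(0)
X = rng.random((30, 6, 8))        # patients x pathways x genes
y = rng.integers(0, 3, size=30)   # three made-up disease types

patient_f, pathway_f, gene_f = nonneg_cp(X, rank=4)
# Rescale each latent group to [0, 1] so one learning rate suits all columns.
feats = patient_f / (patient_f.max(axis=0, keepdims=True) + 1e-12)
W, b = train_softmax(feats, y, n_classes=3)
pred = predict(feats, W, b)
```

In this sketch each column of `pathway_f` plays the role of a pathway group, and each row of `patient_f` gives a patient's loading on those groups, which is what the classifier consumes. Because the factors are nonnegative, the groups can be read as additive combinations of pathways, which is the usual interpretability argument for this family of models.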




