
Graph-Conditioned MLP for High-Dimensional Tabular Biomedical Data

by Andrei Margeloiu, et al.

Genome-wide studies that leverage recent high-throughput sequencing technologies collect high-dimensional data. However, they usually include only small cohorts of patients, so the resulting tabular datasets suffer from the "curse of dimensionality". Training neural networks on such datasets is typically unstable, and the models overfit. One problem is that modern weight initialisation strategies make simplistic assumptions that are unsuitable for small datasets. We propose Graph-Conditioned MLP (GC-MLP), a novel method for introducing priors on the parameters of an MLP. Instead of randomly initialising the first layer, we condition it directly on the training data. More specifically, we create a graph for each feature in the dataset (e.g., a gene), where each node represents a sample from the same dataset (e.g., a patient). We then use Graph Neural Networks (GNNs) to learn embeddings from these graphs and use the embeddings to initialise the MLP's parameters. Our approach opens the prospect of introducing additional biological knowledge when constructing the graphs. We present early results on 7 classification tasks from gene expression data and show that GC-MLP outperforms an MLP.
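
The idea of conditioning the first layer on per-feature graphs can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: edges connect each sample to its k nearest neighbours in that feature's value, the GNN is a single mean-aggregation layer with a fixed random weight (in the method described above the GNN is learned), and the graph embedding is the mean over nodes. The names `knn_adjacency`, `feature_embedding`, and `graph_conditioned_first_layer` are hypothetical.

```python
import numpy as np

def knn_adjacency(values, k):
    """Adjacency over samples for ONE feature (assumed edge rule: kNN on the feature value)."""
    n = values.shape[0]
    dist = np.abs(values[:, None] - values[None, :])   # pairwise distances, shape (n, n)
    np.fill_diagonal(dist, np.inf)                     # a sample is not its own neighbour
    nbrs = np.argsort(dist, axis=1)[:, :k]             # indices of the k closest samples
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)                       # make the graph undirected
    return adj + np.eye(n)                             # add self-loops

def feature_embedding(values, gnn_weight, k=3):
    """One mean-aggregation message-passing layer, then mean pooling over nodes."""
    adj = knn_adjacency(values, k)
    adj_norm = adj / adj.sum(axis=1, keepdims=True)    # row-normalise: average over neighbours
    node_feats = values[:, None]                       # each node carries its feature value, (n, 1)
    hidden = np.tanh(adj_norm @ node_feats @ gnn_weight)  # (n, hidden_dim)
    return hidden.mean(axis=0)                         # graph embedding, (hidden_dim,)

def graph_conditioned_first_layer(X, hidden_dim, k=3, seed=0):
    """Build first-layer weights of shape (num_features, hidden_dim), one row per feature graph."""
    rng = np.random.default_rng(seed)
    # Assumption: a fixed random GNN weight stands in for a trained GNN.
    gnn_weight = rng.normal(size=(1, hidden_dim))
    rows = [feature_embedding(X[:, j], gnn_weight, k) for j in range(X.shape[1])]
    return np.stack(rows)

# Toy data: 20 samples (patients) and 50 features (genes).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
W1 = graph_conditioned_first_layer(X, hidden_dim=8)
hidden = np.maximum(X @ W1, 0.0)                       # first MLP layer with ReLU
```

Each row of `W1` is the embedding of one feature's graph, so the first-layer weights depend on the training data rather than on a random draw alone. Prior biological knowledge could enter through the edge rule in `knn_adjacency`, which is the part of this sketch most open to substitution.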



