Graph-Conditioned MLP for High-Dimensional Tabular Biomedical Data

11/11/2022
by Andrei Margeloiu, et al.

Genome-wide studies that leverage recent high-throughput sequencing technologies collect high-dimensional data. However, they usually include small cohorts of patients, and the resulting tabular datasets suffer from the "curse of dimensionality". Training neural networks on such datasets is typically unstable, and the models overfit. One problem is that modern weight initialisation strategies make simplistic assumptions that are unsuitable for small datasets. We propose Graph-Conditioned MLP (GC-MLP), a novel method for introducing priors on the parameters of an MLP. Instead of randomly initialising the first layer, we condition it directly on the training data. More specifically, we create a graph for each feature in the dataset (e.g., a gene), where each node represents a sample from the same dataset (e.g., a patient). We then use Graph Neural Networks (GNNs) to learn embeddings from these graphs and use the embeddings to initialise the MLP's parameters. Our approach opens the prospect of introducing additional biological knowledge when constructing the graphs. We present early results on seven classification tasks on gene expression data and show that GC-MLP outperforms an MLP.
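The pipeline described in the abstract (one graph per feature, nodes are samples, graph embeddings become first-layer weights) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Everything specific here is an assumption made for illustration: the k-nearest-neighbour graph construction, the single untrained mean-aggregation step standing in for a learned GNN, the mean pooling, and the hypothetical names `knn_adjacency`, `graph_embedding`, and `first_layer_from_graphs`.

```python
import numpy as np


def knn_adjacency(values, k):
    """Row-normalised k-NN adjacency over samples, built from one feature's values.

    Assumption: samples are linked when their values for this feature are close.
    The abstract does not specify how the graphs are constructed.
    """
    n = values.shape[0]
    dist = np.abs(values[:, None] - values[None, :])  # (n, n) pairwise distances
    np.fill_diagonal(dist, np.inf)                    # never pick a node as its own neighbour
    nbrs = np.argsort(dist, axis=1)[:, :k]            # indices of the k closest samples
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    adj = np.maximum(adj, adj.T) + np.eye(n)          # undirected, with self-loops
    return adj / adj.sum(axis=1, keepdims=True)       # mean aggregation


def graph_embedding(values, adj_norm, proj):
    """One message-passing step followed by mean pooling over nodes.

    Stand-in for a GNN: `proj` is random and untrained here, whereas the
    abstract describes embeddings that are learned.
    """
    node_feats = values[:, None]                      # (n, 1): each node holds its feature value
    hidden = np.tanh(adj_norm @ node_feats @ proj)    # (n, hidden_dim)
    return hidden.mean(axis=0)                        # (hidden_dim,) graph-level embedding


def first_layer_from_graphs(X, hidden_dim, k=3, seed=0):
    """Build a (num_features, hidden_dim) first-layer weight matrix from per-feature graphs."""
    rng = np.random.default_rng(seed)
    proj = rng.normal(scale=0.5, size=(1, hidden_dim))  # shared across all feature graphs
    rows = [
        graph_embedding(X[:, j], knn_adjacency(X[:, j], k), proj)
        for j in range(X.shape[1])
    ]
    return np.stack(rows)


# Toy data: 20 samples (patients), 50 features (genes).
X = np.random.default_rng(1).normal(size=(20, 50))
W1 = first_layer_from_graphs(X, hidden_dim=8)
hidden = np.maximum(X @ W1, 0.0)                      # first MLP layer with ReLU
print(W1.shape, hidden.shape)
```

In this sketch each row of `W1` corresponds to one feature, so the first layer has as many weight rows as there are features and each row depends on how the training samples relate to each other under that feature. Swapping the k-NN construction for edges derived from prior biological knowledge would be one way to follow the direction the abstract points to.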


research
11/22/2020

Using ontology embeddings for structural inductive bias in gene expression data analysis

Stratifying cancer patients based on their gene expression levels allows...
research
05/08/2020

Predicting gene expression from network topology using graph neural networks

Motivation: It is known that the structure of transcription and protein ...
research
06/01/2022

Graph Neural Networks with Precomputed Node Features

Most Graph Neural Networks (GNNs) cannot distinguish some graphs or inde...
research
08/19/2023

Geometric instability of graph neural networks on large graphs

We analyse the geometric instability of embeddings produced by graph neu...
research
03/02/2023

Vine dependence graphs with latent variables as summaries for gene expression data

The advent of high-throughput sequencing technologies has led to vast c...
research
07/04/2023

A Neural Collapse Perspective on Feature Evolution in Graph Neural Networks

Graph neural networks (GNNs) have become increasingly popular for classi...
research
09/09/2021

GNisi: A graph network for reconstructing Ising models from multivariate binarized data

Ising models are a simple generative approach to describing interacting ...
