Learning Compositional Representations of Interacting Systems with Restricted Boltzmann Machines: Comparative Study of Lattice Proteins

02/18/2019
by   Jérôme Tubiana, et al.

A Restricted Boltzmann Machine (RBM) is an unsupervised machine-learning bipartite graphical model that jointly learns a probability distribution over data and extracts their relevant statistical features. As such, RBMs were recently proposed for characterizing the patterns of coevolution between amino acids in protein sequences and for designing new sequences. Here, we study how the nature of the features learned by an RBM changes with its defining parameters, such as the dimensionality of the representation (the size of the hidden layer) and the sparsity of the features. We show that, for adequate values of these parameters, RBMs operate in a so-called compositional phase, in which visible configurations sampled from the RBM are obtained by recombining the learned features. We then compare the performance of RBMs with that of other standard representation-learning algorithms, including Principal and Independent Component Analysis (PCA, ICA), autoencoders (AEs), variational autoencoders (VAEs), and their sparse variants. We show that, owing to the stochastic mapping between data configurations and representations, RBMs better capture the underlying interactions in the system and are significantly more robust with respect to sample size than deterministic methods such as PCA or ICA. Moreover, this stochastic mapping is not prescribed a priori, as in VAEs, but learned from the data, which allows RBMs to perform well even with shallow architectures. All numerical results are illustrated on synthetic lattice-protein data, which share similar statistical features with real protein sequences and for which the ground-truth interactions are known.
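To make the abstract's key objects concrete, the sketch below implements a minimal binary-binary RBM trained with one step of contrastive divergence (CD-1). It is an illustrative toy, not the paper's implementation: the layer sizes, learning rate, and the two hand-built "features" used to generate compositional toy data are all assumptions. It does show the two ingredients the abstract emphasizes: the stochastic mapping from visible configurations to hidden representations, and sampling visible configurations back from hidden activity.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class BinaryRBM:
    """Toy binary-binary RBM trained with 1-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, lr=0.05):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))  # couplings
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases
        self.lr = lr

    def sample_hidden(self, v):
        # Stochastic mapping data -> representation: p(h_j = 1 | v)
        p = sigmoid(v @ self.W + self.c)
        return p, (rng.random(p.shape) < p).astype(float)

    def sample_visible(self, h):
        # Generative direction: p(v_i = 1 | h)
        p = sigmoid(h @ self.W.T + self.b)
        return p, (rng.random(p.shape) < p).astype(float)

    def cd1_step(self, v0):
        # Positive phase: hidden statistics driven by the data
        ph0, h0 = self.sample_hidden(v0)
        # Negative phase: one Gibbs step reconstructs the visible layer
        pv1, _ = self.sample_visible(h0)
        ph1, _ = self.sample_hidden(pv1)
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += self.lr * (v0 - pv1).mean(axis=0)
        self.c += self.lr * (ph0 - ph1).mean(axis=0)
        return ((v0 - pv1) ** 2).mean()  # mean squared reconstruction error


# Toy "compositional" data: each sample recombines two binary features
# (hypothetical stand-ins for sequence motifs; not the lattice-protein data).
f1 = np.array([1] * 6 + [0] * 6, dtype=float)
f2 = np.array([0] * 6 + [1] * 6, dtype=float)
coeffs = rng.integers(0, 2, size=(200, 2)).astype(float)
data = np.clip(coeffs[:, [0]] * f1 + coeffs[:, [1]] * f2, 0.0, 1.0)

rbm = BinaryRBM(n_visible=12, n_hidden=4)
errors = [rbm.cd1_step(data) for _ in range(100)]
```

Because the hidden units are sampled rather than computed deterministically, the same visible configuration can map to different representations on different passes; this is the stochastic encoding that the abstract contrasts with deterministic methods such as PCA or ICA.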


