Evolution Is All You Need: Phylogenetic Augmentation for Contrastive Learning

12/25/2020
by Amy X. Lu, et al.

Self-supervised representation learning of biological sequences alleviates computational resource constraints on downstream tasks while circumventing expensive experimental label acquisition. However, existing methods mostly borrow directly from large language models designed for NLP rather than being designed with bioinformatics principles in mind. Recently, contrastive mutual information maximization methods have achieved state-of-the-art representations on ImageNet. In this perspective piece, we discuss how viewing evolution as natural sequence augmentation and maximizing information across phylogenetic "noisy channels" is a biologically and theoretically desirable objective for pretraining encoders. We first review the current contrastive learning literature, then present an illustrative example showing that contrastive learning with evolutionary augmentation serves as a representation learning objective that maximizes the mutual information between biological sequences and their conserved function, and finally outline the rationale for this approach.
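
The objective described above lends itself to a short sketch: treat two homologous sequences from the same family as two "views" of a shared conserved function, and train an encoder with an InfoNCE-style contrastive loss, which lower-bounds the mutual information between views (for a batch of N pairs, I(X; Y) >= log N - L_InfoNCE). The PyTorch sketch below is a minimal illustration under these assumptions; the batch construction, the temperature value, and any homolog-sampling helper are hypothetical, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z_anchor: torch.Tensor,
                  z_positive: torch.Tensor,
                  temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE over a batch of embedding pairs.

    Row i of z_anchor and row i of z_positive are assumed to encode two
    homologous sequences from the same phylogenetic family (e.g., two
    rows of one MSA); every other row in the batch acts as a negative.
    """
    z_anchor = F.normalize(z_anchor, dim=-1)      # cosine-similarity geometry
    z_positive = F.normalize(z_positive, dim=-1)
    # (B, B) similarity matrix: entry (i, j) compares anchor i to positive j.
    logits = z_anchor @ z_positive.t() / temperature
    # Matched homologs sit on the diagonal.
    labels = torch.arange(z_anchor.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)

# Toy usage with random embeddings; in practice these would come from an
# encoder applied to two homologs sampled per family (hypothetical helper).
z1, z2 = torch.randn(32, 128), torch.randn(32, 128)
loss = info_nce_loss(z1, z2)
```

Minimizing this loss pulls embeddings of evolutionarily related sequences together while pushing unrelated families apart, which is the sense in which evolution acts as a natural augmentation in place of the image crops and color jitters used in vision pipelines.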
