Sentence Bottleneck Autoencoders from Transformer Language Models

08/31/2021
by Ivan Montero, et al.

Representation learning for text via pretraining a language model on a large corpus has become a standard starting point for building NLP systems. This approach stands in contrast to autoencoders, also trained on raw text, but with the objective of learning to encode each input as a vector that allows full reconstruction. Autoencoders are attractive because of their latent space structure and generative properties. We therefore explore the construction of a sentence-level autoencoder from a pretrained, frozen transformer language model. We adapt the masked language modeling objective as a generative, denoising one, while only training a sentence bottleneck and a single-layer modified transformer decoder. We demonstrate that the sentence representations discovered by our model achieve better quality than previous methods that extract representations from pretrained transformers on text similarity tasks, style transfer (an example of controlled generation), and single-sentence classification tasks in the GLUE benchmark, while using fewer parameters than large pretrained models.
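To make the architecture concrete, the sketch below shows one way such a model could be wired up in PyTorch with the Hugging Face transformers library. It is illustrative only, not the authors' implementation: the choice of roberta-base as the frozen encoder, the single-query attention pooling used as the bottleneck, and the plain nn.TransformerDecoderLayer standing in for the paper's modified single-layer decoder are all assumptions made for this example.

```python
import torch
import torch.nn as nn
from transformers import AutoModel


class SentenceBottleneckAE(nn.Module):
    """Sentence-bottleneck autoencoder around a frozen pretrained encoder.

    Only the pooling bottleneck, the decoder layer, and the LM head are
    trainable; the pretrained encoder stays frozen throughout.
    """

    def __init__(self, encoder_name="roberta-base", dim=768, nhead=12):
        super().__init__()
        # Frozen pretrained transformer encoder (weights never updated).
        # dim and nhead follow roberta-base and are illustrative.
        self.encoder = AutoModel.from_pretrained(encoder_name)
        for p in self.encoder.parameters():
            p.requires_grad = False

        # Bottleneck: a learned query attends over the token states and
        # compresses them into a single sentence vector.
        self.query = nn.Parameter(torch.randn(1, 1, dim))
        self.pool = nn.MultiheadAttention(dim, nhead, batch_first=True)

        # Single-layer decoder that reconstructs tokens from the sentence
        # vector (a stand-in for the paper's modified transformer decoder).
        self.decoder = nn.TransformerDecoderLayer(
            d_model=dim, nhead=nhead, batch_first=True
        )
        self.lm_head = nn.Linear(dim, self.encoder.config.vocab_size)

    def forward(self, input_ids, attention_mask, noised_input_ids):
        # Encode the clean sentence with the frozen encoder.
        with torch.no_grad():
            hidden = self.encoder(
                input_ids=input_ids, attention_mask=attention_mask
            ).last_hidden_state                            # (B, T, dim)

        # Bottleneck: pool the token states into one vector z per sentence.
        q = self.query.expand(hidden.size(0), -1, -1)      # (B, 1, dim)
        z, _ = self.pool(q, hidden, hidden,
                         key_padding_mask=(attention_mask == 0))

        # Denoising reconstruction: the decoder reads embeddings of the
        # noised (masked) input, cross-attends only to z, and the LM head
        # predicts the original tokens at every position.
        tgt = self.encoder.embeddings(input_ids=noised_input_ids)
        dec = self.decoder(tgt, memory=z)
        return self.lm_head(dec)                           # (B, T, vocab)
```

Training would then minimize cross-entropy between the predicted logits and the original tokens at the corrupted positions, with the optimizer restricted to the trainable parameters (e.g., filter(lambda p: p.requires_grad, model.parameters())), so that only the bottleneck and the small decoder are learned.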


