Discovering Useful Sentence Representations from Large Pretrained Language Models

08/20/2020
by Nishant Subramani, et al.

Despite the extensive success of pretrained language models as encoders for building NLP systems, they have not seen prominence as decoders for sequence generation tasks. We explore whether these models can be adapted to serve as universal decoders. To be considered "universal," a decoder must have an implicit representation for any target sentence s, such that it can recover that sentence exactly when conditioned on its representation. For large transformer-based language models trained on vast amounts of English text, we investigate whether such representations can be easily discovered using standard optimization methods. We present and compare three representation injection techniques for transformer-based models and three accompanying methods that map sentences to and from this representation space. Experiments show that not only do such representations exist for sentences from a variety of genres, but, more importantly, our methods recover these sentences almost perfectly, without complex optimization algorithms and without fine-tuning the underlying language model at all.
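To make the idea concrete, the sketch below shows one way to discover such a representation with standard gradient descent: a vector z is optimized so that a frozen GPT-2 reconstructs a target sentence when z is injected into the model. The additive injection into the input embeddings, the optimizer settings, and the greedy decoding check are illustrative assumptions, not the paper's exact techniques (the paper compares three injection methods).

```python
# Minimal sketch (assumed setup, not the authors' exact method):
# learn a sentence vector z for a frozen GPT-2 by gradient descent,
# injecting z additively into the input embeddings at every position.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()
for p in model.parameters():              # the language model stays frozen
    p.requires_grad_(False)

sentence = "The quick brown fox jumps over the lazy dog."
ids = tokenizer(sentence, return_tensors="pt").input_ids          # (1, T)

d_model = model.config.n_embd
z = torch.zeros(1, 1, d_model, requires_grad=True)                # the sentence representation
opt = torch.optim.Adam([z], lr=0.01)
embed = model.transformer.wte                                      # token embedding table

for step in range(500):
    opt.zero_grad()
    inputs_embeds = embed(ids) + z                                  # inject z (broadcast over positions)
    loss = model(inputs_embeds=inputs_embeds, labels=ids).loss      # cross-entropy on the target sentence
    loss.backward()
    opt.step()

# Greedy decode conditioned on the learned z (seeded with the first target
# token for simplicity) to check whether the sentence is recovered exactly.
with torch.no_grad():
    cur = ids[:, :1]
    for _ in range(ids.shape[1] - 1):
        logits = model(inputs_embeds=embed(cur) + z).logits
        cur = torch.cat([cur, logits[:, -1:].argmax(-1)], dim=-1)
print(tokenizer.decode(cur[0]))
```

Because only z is updated, the procedure leaves the pretrained weights untouched; in this framing, "recovering a sentence" just means that greedy decoding under the injected z reproduces the target tokens.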

