Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space

04/05/2020
by Chunyuan Li, et al.

When trained effectively, the Variational Autoencoder (VAE) can be both a powerful generative model and an effective representation learning framework for natural language. In this paper, we propose the first large-scale language VAE model, Optimus. A universal latent embedding space for sentences is first pre-trained on a large text corpus, and then fine-tuned for various language generation and understanding tasks. Compared with GPT-2, Optimus enables guided language generation at an abstract level via the latent vectors. Compared with BERT, Optimus generalizes better on low-resource language understanding tasks thanks to the smooth structure of its latent space. Extensive experimental results on a wide range of language tasks demonstrate the effectiveness of Optimus: it achieves a new state of the art on VAE language modeling benchmarks. We hope that this first pre-trained large-scale VAE language model and its results help the NLP community renew its interest in deep generative models in the era of large-scale pre-training, and make these principled methods more practical.
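For readers unfamiliar with the mechanism the abstract refers to, the following is a minimal PyTorch sketch of a sentence VAE: encode a sentence to a Gaussian latent z, sample with the reparameterization trick, decode conditioned on z, and train with a reconstruction-plus-KL objective. It is an illustrative stand-in, not the authors' code; the class, dimensions, and toy vocabulary are assumptions, and Optimus itself pairs a BERT encoder with a GPT-2 decoder rather than these small GRUs.

# Minimal sentence-VAE sketch (illustrative only; not the Optimus implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceVAE(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hid_dim=128, latent_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Encoder: reads the token sequence and parameterizes the posterior q(z|x).
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.to_mu = nn.Linear(hid_dim, latent_dim)
        self.to_logvar = nn.Linear(hid_dim, latent_dim)
        # Decoder: reconstructs the sentence conditioned on z via its initial hidden state.
        self.z_to_hidden = nn.Linear(latent_dim, hid_dim)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tokens):
        emb = self.embed(tokens)
        _, h = self.encoder(emb)                      # final hidden state summarizes the sentence
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        # Reparameterization trick: z = mu + sigma * eps keeps sampling differentiable.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        h0 = torch.tanh(self.z_to_hidden(z)).unsqueeze(0)
        dec_out, _ = self.decoder(emb[:, :-1], h0)    # teacher forcing: predict token t+1 from t
        logits = self.out(dec_out)
        return logits, mu, logvar

def vae_loss(logits, tokens, mu, logvar, beta=1.0):
    # ELBO: token reconstruction loss plus a beta-weighted KL(q(z|x) || N(0, I)) term.
    rec = F.cross_entropy(logits.transpose(1, 2), tokens[:, 1:])
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kl

# Toy usage: random token ids stand in for a tokenized sentence batch.
model = SentenceVAE()
tokens = torch.randint(0, 1000, (8, 16))              # 8 "sentences" of 16 tokens each
logits, mu, logvar = model(tokens)
loss = vae_loss(logits, tokens, mu, logvar, beta=0.5)
loss.backward()

In the full model described in the paper, the latent vector is injected into the GPT-2 decoder (for example, as an additional memory vector attended to by self-attention, or added to the token embeddings), and the KL weight is scheduled to mitigate posterior collapse; those details are beyond this sketch.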

Related research

09/13/2021  CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation
    In this paper, we take the advantage of previous pre-trained models (PTM...

01/04/2021  Transformer-based Conditional Variational Autoencoder for Controllable Story Generation
    We investigate large-scale latent variable models (LVMs) for neural stor...

09/01/2021  OptAGAN: Entropy-based finetuning on text VAE-GAN
    Transfer learning through large pre-trained models has changed the lands...

08/29/2021  Variational voxelwise rs-fMRI representation learning: Evaluation of sex, age, and neuropsychiatric signatures
    We propose to apply non-linear representation learning to voxelwise rs-f...

07/08/2022  Hidden Schema Networks
    Most modern language models infer representations that, albeit powerful,...

09/12/2018  Hyperprior Induced Unsupervised Disentanglement of Latent Representations
    We address the problem of unsupervised disentanglement of latent represe...

07/03/2020  Generative Modeling for Atmospheric Convection
    To improve climate modeling, we need a better understanding of multi-sca...
