AdaVAE: Exploring Adaptive GPT-2s in Variational Auto-Encoders for Language Modeling

05/12/2022
by   Haoqin Tu, et al.
13

Variational Auto-Encoder (VAE) has become the de-facto learning paradigm in achieving both representation learning and generation for natural language. However, existing VAE-based language models either employ elementary RNNs, which is not powerful to handle complex situations, or fine-tunes two pre-trained language models (PLMs) for any downstream task, which is a huge drain on resources. In this paper, we introduce the first VAE framework empowered with adaptive GPT-2s (AdaVAE). Different from existing systems, we unify both the encoder&decoder of VAE model using GPT-2s with adaptive parameter-efficient components. Experiments from multiple dimensions validate that AdaVAE is competent to better organize language in generation task and representation modeling, even with less than 15% activated parameters in training. Our code is available at <https://github.com/ImKeTT/adavae>.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/19/2017

Multi-Stage Variational Auto-Encoders for Coarse-to-Fine Image Generation

Variational auto-encoder (VAE) is a powerful unsupervised learning frame...
research
07/08/2022

Hidden Schema Networks

Most modern language models infer representations that, albeit powerful,...
research
02/27/2017

Improved Variational Autoencoders for Text Modeling using Dilated Convolutions

Recent work on generative modeling of text has found that variational au...
research
03/14/2019

Diagnosing and Enhancing VAE Models

Although variational autoencoders (VAEs) represent a widely influential ...
research
09/15/2020

A Systematic Characterization of Sampling Algorithms for Open-ended Language Generation

This work studies the widely adopted ancestral sampling algorithms for a...
research
11/15/2022

An Overview on Controllable Text Generation via Variational Auto-Encoders

Recent advances in neural-based generative modeling have reignited the h...
research
05/03/2023

CodeGen2: Lessons for Training LLMs on Programming and Natural Languages

Large language models (LLMs) have demonstrated remarkable abilities in r...

Please sign up or login with your details

Forgot password? Click here to reset