Can the Transformer Be Used as a Drop-in Replacement for RNNs in Text-Generating GANs?

08/26/2021
by   Kevin Blin, et al.

In this paper we address the problem of fine-tuned text generation with a limited computational budget. To that end, we take a well-performing text generative adversarial network (GAN) architecture, the Diversity-Promoting GAN (DPGAN), and attempt a drop-in replacement of its LSTM layer with a self-attention-based Transformer layer in order to leverage the Transformer's efficiency. The resulting Self-Attention DPGAN (SADPGAN) was evaluated for performance, quality and diversity of the generated text, and training stability. Computational experiments suggest that the Transformer architecture cannot simply drop-in replace the LSTM layer: it under-performs during the pre-training phase and undergoes complete mode collapse during the GAN tuning phase. Our results suggest that the Transformer architecture needs to be adapted before it can serve as a replacement for RNNs in text-generating GANs.
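To make the experiment concrete, the following PyTorch sketch (not the authors' code; all module names, layer counts, and sizes are illustrative assumptions) shows the kind of swap the abstract describes: an LSTM generator backbone and a shape-compatible variant whose recurrent layer is replaced by a single causal self-attention Transformer layer.

```python
# Minimal sketch of the drop-in swap studied in the paper, assuming a
# DPGAN-like generator interface. Hyperparameters are illustrative only.
import torch
import torch.nn as nn

VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM, MAX_LEN = 5000, 128, 128, 32

class LSTMGenerator(nn.Module):
    """RNN-style generator backbone, as in the original DPGAN setup."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.lstm = nn.LSTM(EMBED_DIM, HIDDEN_DIM, batch_first=True)
        self.head = nn.Linear(HIDDEN_DIM, VOCAB_SIZE)

    def forward(self, tokens):                # tokens: (batch, seq)
        h, _ = self.lstm(self.embed(tokens))  # (batch, seq, hidden)
        return self.head(h)                   # next-token logits

class TransformerGenerator(nn.Module):
    """Same input/output interface, but the recurrence is replaced by a
    single self-attention Transformer encoder layer with a causal mask."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        # Learned positions stand in for the order information an LSTM
        # gets from recurrence.
        self.pos = nn.Embedding(MAX_LEN, EMBED_DIM)
        layer = nn.TransformerEncoderLayer(
            d_model=EMBED_DIM, nhead=4, dim_feedforward=4 * EMBED_DIM,
            batch_first=True)
        self.attn = nn.TransformerEncoder(layer, num_layers=1)
        self.head = nn.Linear(EMBED_DIM, VOCAB_SIZE)

    def forward(self, tokens):
        seq = tokens.size(1)
        x = self.embed(tokens) + self.pos(
            torch.arange(seq, device=tokens.device))
        # Upper-triangular -inf mask keeps generation autoregressive,
        # matching the left-to-right behaviour of the LSTM.
        mask = torch.triu(
            torch.full((seq, seq), float("-inf"), device=tokens.device),
            diagonal=1)
        return self.head(self.attn(x, mask=mask))

# Both backbones expose identical shapes, so either can be plugged into
# the same pre-training and adversarial-tuning loop unchanged.
tokens = torch.randint(0, VOCAB_SIZE, (2, MAX_LEN))
assert LSTMGenerator()(tokens).shape == TransformerGenerator()(tokens).shape
```

Because the two backbones agree on input and output shapes, any difference in pre-training perplexity or GAN-phase stability can be attributed to the layer swap itself, which is the comparison the paper reports on.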


Related research

04/01/2021
Keyword Transformer: A Self-Attention Model for Keyword Spotting
The Transformer architecture has been successful across many domains, in...

09/28/2018
SALSA-TEXT: self attentive latent space based adversarial text generation
Inspired by the success of self attention mechanism and Transformer arch...

06/14/2021
Improved Transformer for High-Resolution GANs
Attention-based models, exemplified by the Transformer, can effectively ...

06/25/2020
Empirical Analysis of Overfitting and Mode Drop in GAN Training
We examine two key questions in GAN training, namely overfitting and mod...

07/16/2023
Self-Attention Based Generative Adversarial Networks For Unsupervised Video Summarization
In this paper, we study the problem of producing a comprehensive video s...

10/25/2021
STransGAN: An Empirical Study on Transformer in GANs
Transformer becomes prevalent in computer vision, especially for high-le...

06/07/2020
Realistic text replacement with non-uniform style conditioning
In this work, we study the possibility of realistic text replacement, th...
