
Can the Transformer Be Used as a Drop-in Replacement for RNNs in Text-Generating GANs?

08/26/2021
by Kevin Blin, et al.

In this paper we address the problem of fine-tuned text generation with a limited computational budget. To that end, we take a well-performing text generative adversarial network (GAN) architecture, the Diversity-Promoting GAN (DPGAN), and attempt a drop-in replacement of its LSTM layer with a self-attention-based Transformer layer in order to leverage the latter's efficiency. The resulting Self-Attention DPGAN (SADPGAN) was evaluated for performance, for the quality and diversity of the generated text, and for training stability. Computational experiments suggest that the Transformer architecture cannot simply replace the LSTM layer: it under-performs during the pre-training phase and undergoes a complete mode collapse during the GAN tuning phase. Our results suggest that the Transformer architecture needs to be adapted before it can be used as a replacement for RNNs in text-generating GANs.
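The abstract describes exchanging the recurrent backbone of DPGAN's generator for a self-attention layer. The paper's own code is not reproduced here; the following is a minimal, hypothetical PyTorch sketch of what such a drop-in swap can look like: an LSTM-backed generator alongside a variant in which the recurrent layer is replaced by a single masked Transformer encoder layer. All class names, dimensions, and the learned positional embedding are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class LSTMGenerator(nn.Module):
    # Baseline DPGAN-style generator backbone: token embedding -> LSTM -> vocabulary logits.
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):                      # tokens: (batch, seq_len) int64
        hidden, _ = self.rnn(self.embed(tokens))
        return self.out(hidden)                     # (batch, seq_len, vocab_size)

class TransformerGenerator(nn.Module):
    # "Drop-in" variant: the LSTM is swapped for one self-attention (Transformer encoder) layer.
    # A causal mask preserves the left-to-right factorization needed for autoregressive sampling;
    # a learned positional embedding is added because self-attention is order-agnostic.
    def __init__(self, vocab_size, emb_dim=128, n_heads=4, max_len=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.pos = nn.Embedding(max_len, emb_dim)
        layer = nn.TransformerEncoderLayer(d_model=emb_dim, nhead=n_heads, batch_first=True)
        self.attn = nn.TransformerEncoder(layer, num_layers=1)
        self.out = nn.Linear(emb_dim, vocab_size)

    def forward(self, tokens):                      # tokens: (batch, seq_len) int64
        seq_len = tokens.size(1)
        positions = torch.arange(seq_len, device=tokens.device)
        causal = torch.triu(torch.full((seq_len, seq_len), float("-inf"),
                                       device=tokens.device), diagonal=1)
        hidden = self.attn(self.embed(tokens) + self.pos(positions), mask=causal)
        return self.out(hidden)                     # (batch, seq_len, vocab_size)

Both modules map token ids of shape (batch, seq_len) to per-position vocabulary logits, so one can be exchanged for the other without touching the rest of the GAN training loop; this is the sense in which the paper tests a "drop-in" replacement.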


04/01/2021

Keyword Transformer: A Self-Attention Model for Keyword Spotting

The Transformer architecture has been successful across many domains, in...
09/28/2018

SALSA-TEXT : self attentive latent space based adversarial text generation

Inspired by the success of self attention mechanism and Transformer arch...
06/14/2021

Improved Transformer for High-Resolution GANs

Attention-based models, exemplified by the Transformer, can effectively ...
06/25/2020

Empirical Analysis of Overfitting and Mode Drop in GAN Training

We examine two key questions in GAN training, namely overfitting and mod...
03/05/2018

ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing

We address the problem of finding realistic geometric corrections to a f...
04/27/2021

Text Generation with Deep Variational GAN

Generating realistic sequences is a central task in many machine learnin...
11/14/2022

Evade the Trap of Mediocrity: Promoting Diversity and Novelty in Text Generation via Concentrating Attention

Recently, powerful Transformer architectures have proven superior in gen...