
Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet

by Mahir Onat Topal et al.

Recent years have seen a proliferation of attention mechanisms and the rise of Transformers in Natural Language Generation (NLG). Previously, state-of-the-art NLG architectures such as RNNs and LSTMs ran into vanishing gradient problems: as sentences grew longer, the distance between positions grew linearly, and sequential, word-by-word processing hindered parallelization. Transformers usher in a new era. In this paper, we explore three major Transformer-based models, namely GPT, BERT, and XLNet, that carry significant implications for the field. NLG is a burgeoning area that is now bolstered by rapid developments in attention mechanisms. From poetry generation to summarization, text generation benefits as Transformer-based language models achieve groundbreaking results.
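As an illustration of the attention mechanism at the heart of GPT, BERT, and XLNet, the scaled dot-product attention from the Transformer architecture can be sketched as follows. This is a minimal NumPy sketch for intuition, not code from the paper; the function name and shapes are illustrative:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    # Q, K: (seq_len, d_k); V: (seq_len, d_v)
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # pairwise similarity of queries and keys
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                        # weighted mix of values
```

Because every position attends to every other position in one matrix product, the path length between any two tokens is constant and the computation parallelizes across the sequence, in contrast to the word-by-word recurrence of RNNs and LSTMs.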



