Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet

02/16/2021
by Mahir Onat Topal, et al.

Recent years have seen a proliferation of attention mechanisms and the rise of Transformers in Natural Language Generation (NLG). Earlier state-of-the-art NLG architectures such as RNNs and LSTMs suffered from vanishing gradients; the number of steps separating two positions grew linearly with their distance in the sentence, and because sentences were processed word by word, sequential computation hindered parallelization. Transformers usher in a new era. In this paper, we explore three major Transformer-based models, namely GPT, BERT, and XLNet, that carry significant implications for the field. NLG is a burgeoning area, now bolstered by rapid developments in attention mechanisms. From poetry generation to summarization, text generation benefits as Transformer-based language models achieve groundbreaking results.


research
04/20/2019

Language Models with Transformers

The Transformer architecture is superior to RNN-based models in computat...
research
05/29/2023

Transformer Language Models Handle Word Frequency in Prediction Head

Prediction head is a crucial component of Transformer language models. D...
research
05/19/2021

Laughing Heads: Can Transformers Detect What Makes a Sentence Funny?

The automatic detection of humor poses a grand challenge for natural lan...
research
02/27/2021

Transformers with Competitive Ensembles of Independent Mechanisms

An important development in deep learning from the earliest MLPs has bee...
research
10/06/2020

Stepwise Extractive Summarization and Planning with Structured Transformers

We propose encoder-centric stepwise models for extractive summarization ...
research
06/09/2021

Auto-tagging of Short Conversational Sentences using Natural Language Processing Methods

In this study, we aim to find a method to auto-tag sentences specific to...
research
12/16/2022

Plansformer: Generating Symbolic Plans using Transformers

Large Language Models (LLMs) have been the subject of active research, s...
