Hierarchical GPT with Congruent Transformers for Multi-Sentence Language Models

09/18/2020
by   Jihyeon Roh, et al.

We report a GPT-based multi-sentence language model for dialogue generation and document understanding. First, we propose a hierarchical GPT that consists of three blocks: a sentence encoding block, a sentence generating block, and a sentence decoding block. The sentence encoding and decoding blocks are essentially the encoder and decoder blocks of the standard Transformer, operating on each sentence independently. The sentence generating block is inserted between the encoding and decoding blocks and generates the next sentence embedding vector from the previous sentence embedding vectors. We believe this mirrors the way humans converse and understand paragraphs and documents. Since each sentence consists of relatively few words, the sentence encoding and decoding Transformers can use much smaller embedding dimensions. Second, we note that the attention in Transformers uses the inner product as its similarity measure. To compare two vectors in the same space, we therefore set the transformation matrices for queries and keys to be identical; otherwise, the notion of similarity is incongruent. We report experimental results showing that these two modifications improve language model performance on tasks involving multiple sentences.
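The "congruent" attention modification can be illustrated with a minimal sketch. The key idea is that queries and keys share one projection matrix, so the inner product compares vectors in a single space. The function name `congruent_attention` and the single-head, unbatched numpy formulation below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def congruent_attention(X, W_qk, W_v):
    """Single-head self-attention with tied query/key projections.

    X    : (n_tokens, d_model) input embeddings
    W_qk : (d_model, d_k) shared projection for queries AND keys,
           so the inner-product similarity is computed in one space
           (illustrating the paper's 'congruent' modification)
    W_v  : (d_model, d_v) value projection
    """
    Q = X @ W_qk                 # queries
    K = X @ W_qk                 # keys: same matrix as queries
    V = X @ W_v
    scores = Q @ K.T / np.sqrt(W_qk.shape[1])  # scaled dot-product
    return softmax(scores, axis=-1) @ V
```

One consequence of tying the projections is that the pre-softmax score matrix `Q @ K.T` is symmetric, which is exactly what makes the inner product behave as a similarity measure between vectors of the same space; with separate query and key matrices, that symmetry (and hence the congruence) is lost.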

