Learning Better Universal Representations from Pre-trained Contextualized Language Models

04/29/2020
by Yian Li, et al.

Pre-trained contextualized language models such as BERT have shown great effectiveness in a wide range of downstream natural language processing (NLP) tasks. However, the representations these models offer target each token inside a sequence rather than the sequence itself, and fine-tuning feeds both sequences into the model at once, leading to unsatisfactory representations of each individual sequence. Moreover, because these models treat sentence-level sequences as the full training context, they perform poorly on lower-level linguistic units (phrases and words). In this work, we present a novel framework built on BERT that generates universal, fixed-size representations for input sequences of any length, i.e., words, phrases, and sentences, using large-scale natural language inference and paraphrase data with multiple training objectives. The proposed framework adopts a Siamese network, learning sentence-level representations from the natural language inference dataset and phrase- and word-level representations from the paraphrase dataset, respectively. We evaluate our model across text similarity tasks of different granularities, including STS tasks, SemEval 2013 Task 5(a), and several commonly used word similarity tasks, where our model substantially outperforms other representation models on sentence-level datasets and achieves significant improvements in word- and phrase-level representation.
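As a rough illustration of the Siamese setup described in the abstract, the sketch below encodes two inputs with a shared BERT encoder, mean-pools the token states of each into a fixed-size vector, and attaches an NLI-style classification head over the concatenation [u; v; |u - v|]. This is a minimal sketch assuming the Hugging Face transformers API; the mean-pooling strategy and the classifier head are illustrative assumptions, not the paper's exact training objectives.

```python
# Minimal sketch of a Siamese BERT encoder producing fixed-size sequence
# embeddings. Assumes the Hugging Face `transformers` API; the pooling and
# the NLI head are illustrative choices, not the authors' exact configuration.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class SiameseEncoder(nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_nli_labels=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)  # shared weights for both branches
        hidden = self.encoder.config.hidden_size
        # NLI head over [u; v; |u - v|], as in typical Siamese setups.
        self.nli_head = nn.Linear(3 * hidden, num_nli_labels)

    def embed(self, input_ids, attention_mask):
        # Mean-pool token states into one fixed-size vector per sequence.
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        mask = attention_mask.unsqueeze(-1).float()
        summed = (out.last_hidden_state * mask).sum(dim=1)
        return summed / mask.sum(dim=1).clamp(min=1e-9)

    def forward(self, batch_a, batch_b):
        u = self.embed(**batch_a)
        v = self.embed(**batch_b)
        features = torch.cat([u, v, (u - v).abs()], dim=-1)
        return self.nli_head(features), u, v


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = SiameseEncoder()
a = tokenizer(["A man is playing guitar."], return_tensors="pt", padding=True)
b = tokenizer(["Someone plays an instrument."], return_tensors="pt", padding=True)
logits, u, v = model(
    {"input_ids": a["input_ids"], "attention_mask": a["attention_mask"]},
    {"input_ids": b["input_ids"], "attention_mask": b["attention_mask"]},
)
# The fixed-size vectors u and v can be compared directly, whether the inputs
# are words, phrases, or full sentences.
cosine = torch.nn.functional.cosine_similarity(u, v)
```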

Related research:

- 02/20/2018: SufiSent - Universal Sentence Representations Using Suffix Encodings. Computing universal distributed representations of sentences is a fundam...
- 10/15/2021: Probing as Quantifying the Inductive Bias of Pre-trained Representations. Pre-trained contextual representations have led to dramatic performance ...
- 04/17/2021: Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as a Target for NLP. Cryptic crosswords, the dominant English-language crossword variety in t...
- 04/20/2023: Multi-aspect Repetition Suppression and Content Moderation of Large Language Models. Natural language generation is one of the most impactful fields in NLP, ...
- 11/07/2019: Probing Contextualized Sentence Representations with Visual Awareness. We present a universal framework to model contextualized sentence repres...
- 04/20/2015: Self-Adaptive Hierarchical Sentence Model. The ability to accurately model a sentence at varying stages (e.g., word...
- 12/28/2020: BURT: BERT-inspired Universal Representation from Learning Meaningful Segment. Although pre-trained contextualized language models such as BERT achieve...
