Overcoming Poor Word Embeddings with Word Definitions

03/05/2021
by Christopher Malon, et al.

Modern natural language understanding models depend on pretrained subword embeddings, but applications may need to reason about words that were never or rarely seen during pretraining. We show that examples that depend critically on a rarer word are more challenging for natural language inference models. Then we explore how a model could learn to use definitions, provided in natural text, to overcome this handicap. Our model's understanding of a definition is usually weaker than a well-modeled word embedding, but it recovers most of the performance gap from using a completely untrained word.
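The abstract does not say how definitions are presented to the model, so the following is only a minimal sketch of one plausible setup, not the authors' method. It assumes a definition is appended as plain text to a premise/hypothesis pair before the pair is passed to a natural language inference model. The function name `attach_definitions`, the `[SEP]` separator, the `word : definition` layout, and the example sentences and definitions are all hypothetical.

```python
def attach_definitions(premise, hypothesis, definitions, sep=" [SEP] "):
    """Build one text input for an NLI model, appending definitions of rare words.

    `definitions` maps a word to a natural-language definition (hypothetical data).
    Only words that occur in the premise or hypothesis are appended.
    """
    # Lowercased tokens with surrounding punctuation stripped, for a simple lookup.
    tokens = {t.strip(".,;:!?").lower() for t in (premise + " " + hypothesis).split()}
    parts = [premise, hypothesis]
    for word, definition in definitions.items():
        if word.lower() in tokens:
            # Assumed layout: the defined word, a colon, then its definition.
            parts.append(f"{word} : {definition}")
    return sep.join(parts)


example = attach_definitions(
    "A man is playing the theorbo.",
    "A man is playing an instrument.",
    {
        "theorbo": "a large lute with extra bass strings",
        "zither": "a flat stringed instrument",
    },
)
print(example)
```

In this sketch only the definition of "theorbo" is appended, because "zither" does not appear in either sentence. Under this assumed setup, a model would have to read the definition text in place of relying on a poorly trained embedding for the rare word.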


