Grounded Compositional Outputs for Adaptive Language Modeling

09/24/2020
by Nikolaos Pappas, et al.

Language models have emerged as a central component across NLP, and a great deal of progress depends on the ability to cheaply adapt them (e.g., through finetuning) to new domains and tasks. A language model's vocabulary, typically selected before training and fixed permanently thereafter, affects its size and is part of what makes it resistant to such adaptation. Prior work has used compositional input embeddings based on surface forms to mitigate this issue. In this work, we go one step further and propose a fully compositional output embedding layer for language models, grounded in information from a structured lexicon (WordNet): semantically related words and free-text definitions. To our knowledge, the result is the first word-level language model whose size does not depend on the training vocabulary. We evaluate the model on conventional language modeling as well as challenging cross-domain settings with an open vocabulary, finding that it matches or outperforms previous state-of-the-art output embedding methods and adaptation approaches. Our analysis attributes the improvements to sample efficiency: our model is more accurate on low-frequency words.
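To make the idea concrete, below is a minimal PyTorch sketch of a compositional output layer in the spirit of the abstract. It is not the authors' implementation: the class name, the choice of a character-level GRU for surface forms, and mean-pooling over WordNet definition tokens are all illustrative assumptions, and the paper's grounding in semantically related words is omitted for brevity.

```python
# Sketch: output embeddings composed from each word's surface form and its
# WordNet gloss, so the layer's parameter count does not grow with the
# training vocabulary. Illustrative only; not the paper's exact model.
import torch
import torch.nn as nn

class CompositionalOutput(nn.Module):
    def __init__(self, n_chars: int, n_def_tokens: int, d_model: int):
        super().__init__()
        # Character inventory and definition-token inventory are small and
        # fixed; neither grows with the word vocabulary.
        self.char_emb = nn.Embedding(n_chars, d_model, padding_idx=0)
        self.char_rnn = nn.GRU(d_model, d_model, batch_first=True)
        self.def_emb = nn.Embedding(n_def_tokens, d_model, padding_idx=0)

    def embed_words(self, char_ids: torch.Tensor, def_ids: torch.Tensor) -> torch.Tensor:
        # char_ids: (V, max_chars) character ids of each word's surface form.
        # def_ids:  (V, max_def_len) token ids of each word's WordNet gloss.
        _, h = self.char_rnn(self.char_emb(char_ids))         # h: (1, V, d)
        surface = h.squeeze(0)                                # (V, d)
        mask = (def_ids != 0).float().unsqueeze(-1)           # (V, L, 1)
        gloss = (self.def_emb(def_ids) * mask).sum(1) / mask.sum(1).clamp(min=1.0)
        return surface + gloss                                # (V, d)

    def forward(self, hidden: torch.Tensor, char_ids: torch.Tensor,
                def_ids: torch.Tensor) -> torch.Tensor:
        # hidden: (B, T, d) LM hidden states. Logits are dot products with
        # embeddings composed on the fly, so the vocabulary can be open.
        return hidden @ self.embed_words(char_ids, def_ids).t()  # (B, T, V)
```

Because the vocabulary enters only through char_ids and def_ids, words first seen at adaptation time can be scored by supplying their spelling and definition; no new parameters are added.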

Related research

- 12/10/2016: A Character-Word Compositional Neural Language Model for Finnish. "Inspired by recent research, we explore ways to model the highly morphol..."
- 09/29/2020: Improving Low Compute Language Modeling with In-Domain Embedding Initialisation. "Many NLP applications, such as biomedical data and technical support, ha..."
- 08/31/2019: Behavior Gated Language Models. "Most current language modeling techniques only exploit co-occurrence, se..."
- 08/18/2017: Syllable-level Neural Language Model for Agglutinative Language. "Language models for agglutinative languages have always been hindered in..."
- 11/26/2020: Unigram-Normalized Perplexity as a Language Model Performance Measure with Different Vocabulary Sizes. "Although Perplexity is a widely used performance metric for language mod..."
- 02/28/2023: Linear Spaces of Meanings: the Compositional Language of VLMs. "We investigate compositional structures in vector data embeddings from p..."
- 11/23/2022: Word-Level Representation From Bytes For Language Modeling. "Modern language models mostly take sub-words as input, a design that bal..."
