Getting BART to Ride the Idiomatic Train: Learning to Represent Idiomatic Expressions

07/08/2022
by Ziheng Zeng, et al.

Idiomatic expressions (IEs), characterized by their non-compositionality, are an important part of natural language. They have long been a challenge for NLP, including for the pre-trained language models that drive today's state of the art. Prior work has identified deficiencies in the contextualized representations of IEs, stemming from the underlying compositional paradigm of representation. In this work, we take a first-principles approach to building idiomaticity into BART, using an adapter as a lightweight non-compositional language expert trained on idiomatic sentences. The improved capability over baselines (e.g., BART) is demonstrated via intrinsic and extrinsic evaluation: idiom embeddings score 0.19 points higher in homogeneity for embedding clustering, and sequence accuracy improves by up to 25 points on the idiom processing tasks of IE sense disambiguation and span detection.
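To make the adapter idea concrete, the sketch below shows one plausible way to attach a lightweight bottleneck adapter to a frozen BART model using PyTorch and HuggingFace Transformers. This is an illustrative assumption, not the authors' released code: the names BottleneckAdapter and attach_adapter, the bottleneck width of 64, and the hook-based attachment are all hypothetical choices made for the example.

```python
import torch.nn as nn
from transformers import BartModel

class BottleneckAdapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states):
        return hidden_states + self.up(self.act(self.down(hidden_states)))

model = BartModel.from_pretrained("facebook/bart-base")

# Freeze the pre-trained weights; only the adapter parameters would be
# updated when training on idiomatic sentences.
for p in model.parameters():
    p.requires_grad = False

adapter_modules = nn.ModuleList()

def attach_adapter(layer):
    adapter = BottleneckAdapter(model.config.d_model)
    adapter_modules.append(adapter)

    def hook(module, inputs, outputs):
        # Transformer layers return tuples; adapt the hidden states and
        # pass any extra outputs (e.g. attention weights) through unchanged.
        return (adapter(outputs[0]),) + tuple(outputs[1:])

    layer.register_forward_hook(hook)

# Insert one adapter after every encoder and decoder layer.
for layer in list(model.encoder.layers) + list(model.decoder.layers):
    attach_adapter(layer)
```

A training loop would then optimize only adapter_modules.parameters() on a corpus of idiomatic sentences, leaving BART's original weights untouched, which is what makes the adapter a lightweight non-compositional expert.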

