Sentence Ambiguity, Grammaticality and Complexity Probes

10/13/2022
by Sunit Bhattacharya, et al.

It is unclear whether, how, and where large pre-trained language models capture subtle linguistic traits such as ambiguity, grammaticality, and sentence complexity. We present results of automatic classification of these traits and compare their viability and patterns across representation types. We demonstrate that template-based datasets with surface-level artifacts should not be used for probing, that careful comparisons with baselines are necessary, and that t-SNE plots should not be used to determine the presence of a feature among dense vector representations. We also show that features may be highly localized in particular layers of these models and may get lost in the upper layers.

research
10/06/2020

Analyzing Individual Neurons in Pre-trained Language Models

While a lot of analysis has been carried to demonstrate linguistic knowl...
research
07/15/2019

Myers-Briggs Personality Classification and Personality-Specific Language Generation Using Pre-trained Language Models

The Myers-Briggs Type Indicator (MBTI) is a popular personality metric t...
research
09/11/2022

Probing for Understanding of English Verb Classes and Alternations in Large Pre-trained Language Models

We investigate the extent to which verb alternation classes, as describe...
research
04/08/2021

A Simple Geometric Method for Cross-Lingual Linguistic Transformations with Pre-trained Autoencoders

Powerful sentence encoders trained for multiple languages are on the ris...
research
12/20/2022

Identifying and Manipulating the Personality Traits of Language Models

Psychology research has long explored aspects of human personality such ...
research
02/25/2022

Exploring Multi-Modal Representations for Ambiguity Detection & Coreference Resolution in the SIMMC 2.0 Challenge

Anaphoric expressions, such as pronouns and referential descriptions, ar...
research
03/01/2021

Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language

Current NLP datasets targeting ambiguity can be solved by a native speak...
