News Headline Grouping as a Challenging NLU Task

05/12/2021
by Philippe Laban, et al.

Recent progress in Natural Language Understanding (NLU) has seen the latest models surpass human performance on many standard tasks. These impressive results have led the community to introspect on dataset limitations and to iterate on more nuanced challenges. In this paper, we introduce the task of HeadLine Grouping (HLG) and a corresponding dataset (HLGD) consisting of 20,056 pairs of news headlines, each labeled with a binary judgement as to whether the pair belongs within the same group. On HLGD, human annotators achieve high performance of around 0.9 F-1, while current state-of-the-art Transformer models only reach 0.75 F-1, opening the path for further improvements. We further propose a novel unsupervised Headline Generator Swap model for the task of HeadLine Grouping that achieves within 3 F-1 points of the best supervised model. Finally, we analyze high-performing models with consistency tests, and find that models are not consistent in their predictions, revealing modeling limits of current architectures.
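
To make the F-1 figures above concrete, here is a minimal sketch of how F-1 can be computed over binary pair judgements. It assumes that label 1 means "same group" and is treated as the positive class; the abstract does not state this convention. The `gold` and `pred` lists are made-up toy values, not HLGD data, and `f1_score` is a hypothetical helper rather than the authors' evaluation code.

```python
def f1_score(gold, pred):
    """F-1 for binary labels, treating 1 ("same group") as the positive class.

    Assumption: this positive-class convention is illustrative only.
    """
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correct positive prediction: precision or recall is zero/undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical labels): one entry per headline pair.
gold = [1, 1, 0, 0, 1]
pred = [1, 0, 0, 1, 1]
score = f1_score(gold, pred)
```

Under this convention, a model is penalized both for grouping unrelated headlines (false positives) and for missing pairs that belong together (false negatives).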


