SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

05/02/2019
by Alex Wang, et al.

In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks. The GLUE benchmark, introduced one year ago, offers a single-number metric that summarizes progress on a diverse set of such tasks, but performance on the benchmark has recently come close to the level of non-expert humans, suggesting limited headroom for further research. This paper recaps lessons learned from the GLUE benchmark and presents SuperGLUE, a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, improved resources, and a new public leaderboard. SuperGLUE will be available soon at super.gluebenchmark.com.

research
12/03/2021

The Catalan Language CLUB

The Catalan Language Understanding Benchmark (CLUB) encompasses various ...
research
04/20/2018

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

For natural language understanding (NLU) technology to be maximally usef...
research
02/15/2022

Russian SuperGLUE 1.1: Revising the Lessons not Learned by Russian NLP models

In the last year, new neural architectures and multilingual pre-trained ...
research
05/24/2019

Human vs. Muppet: A Conservative Estimate of Human Performance on the GLUE Benchmark

The GLUE benchmark (Wang et al., 2019b) is a suite of language understan...
research
11/02/2021

Adapting to the Long Tail: A Meta-Analysis of Transfer Learning Research for Language Understanding Tasks

Natural language understanding (NLU) has made massive progress driven by...
research
04/07/2020

Evaluating Machines by their Real-World Language Use

There is a fundamental gap between how humans understand and use languag...
