Self-Improving-Leaderboard(SIL): A Call for Real-World Centric Natural Language Processing Leaderboards

03/20/2023
by   Chanjun Park, et al.

Leaderboard systems allow researchers to objectively evaluate Natural Language Processing (NLP) models and are typically used to identify models that exhibit superior performance on a given task in a predetermined setting. However, we argue that evaluation on a given test dataset is just one of many indicators of a model's performance. In this paper, we claim that leaderboard competitions should also aim to identify models that exhibit the best performance in a real-world setting. We highlight three issues with current leaderboard systems: (1) the use of a single, static test set, (2) the discrepancy between testing and real-world application, and (3) the tendency for leaderboard-centric competition to be biased towards the test set. As a solution, we propose a new paradigm of leaderboard systems that addresses these issues. Through this study, we hope to induce a paradigm shift towards more real-world-centric leaderboard competitions.


research
08/28/2016

What to do about non-standard (or non-canonical) language in NLP

Real world data differs radically from the benchmark corpora we use in n...
research
06/13/2023

Survey on Sociodemographic Bias in Natural Language Processing

Deep neural networks often learn unintended biases during training, whic...
research
04/23/2020

DuReader_robust: A Chinese Dataset Towards Evaluating the Robustness of Machine Reading Comprehension Models

Machine Reading Comprehension (MRC) is a crucial and challenging task in...
research
09/03/2021

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

The dominating NLP paradigm of training a strong neural predictor to per...
research
09/06/2019

Show Your Work: Improved Reporting of Experimental Results

Research in natural language processing proceeds, in part, by demonstrat...
research
10/29/2022

A Critical Reflection and Forward Perspective on Empathy and Natural Language Processing

We review the state of research on empathy in natural language processin...
research
11/10/2019

Location Attention for Extrapolation to Longer Sequences

Neural networks are surprisingly good at interpolating and perform remar...
