Beyond Static Models and Test Sets: Benchmarking the Potential of Pre-trained Models Across Tasks and Languages

05/12/2022
by Kabir Ahuja, et al.

Although recent Massively Multilingual Language Models (MMLMs) such as mBERT and XLMR support around 100 languages, most existing multilingual NLP benchmarks provide evaluation data in only a handful of these languages, with little linguistic diversity. We argue that this makes existing multilingual evaluation practices unreliable and fails to give a full picture of MMLM performance across the linguistic landscape. We propose that recent work on Performance Prediction for NLP tasks can help fix benchmarking in multilingual NLP by using features related to data and language typology to estimate the performance of an MMLM on different languages. In a case study on four multilingual datasets, we compare performance prediction with translating test data, and observe that these methods can provide reliable performance estimates that are often on par with translation-based approaches, without incurring any additional translation or evaluation costs.
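
The abstract does not say which predictor or which exact features are used, so the following is only a minimal sketch of the general idea: fit a regressor on per-language features for languages that have test data, then check how well it extrapolates with a leave-one-language-out loop. The plain linear model, the two feature names (log pre-training data size, typological similarity to a pivot language), and all numbers are assumptions made for illustration; the scores are synthetic values generated from a fixed linear rule, not results from the paper.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a @ x = b using Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear(features: List[List[float]], scores: List[float]) -> List[float]:
    """Ordinary least squares via the normal equations; weights[0] is the intercept."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, scores)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights: List[float], feature: Sequence[float]) -> float:
    """Predicted task score for one language."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], feature))


def leave_one_language_out(features: List[List[float]], scores: List[float]) -> List[float]:
    """Absolute prediction error for each language when it is held out of training."""
    errors = []
    for i in range(len(scores)):
        train_x = features[:i] + features[i + 1:]
        train_y = scores[:i] + scores[i + 1:]
        weights = fit_linear(train_x, train_y)
        errors.append(abs(predict(weights, features[i]) - scores[i]))
    return errors


# Hypothetical per-language features (assumed, not from the paper):
# [log10 of pre-training data size, typological similarity to the pivot language]
FEATURES = [[2.0, 0.9], [1.5, 0.4], [3.0, 0.7], [0.5, 0.2], [2.5, 0.3], [1.0, 0.8]]
# Synthetic scores from a fixed linear rule, so the toy predictor can recover them.
SCORES = [0.5 + 0.1 * size + 0.2 * sim for size, sim in FEATURES]

ERRORS = leave_one_language_out(FEATURES, SCORES)
```

Because the toy scores are exactly linear in the features, the held-out errors in `ERRORS` are close to zero. With real benchmark scores the same loop would instead indicate how far such a predictor can be trusted for languages that lack test data.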

