Average Is Not Enough: Caveats of Multilingual Evaluation

01/03/2023
by Matúš Pikuliak et al.

This position paper discusses the problem of multilingual evaluation. Relying on simple statistics, such as average per-language performance, can inject linguistic biases in favor of dominant language families into evaluation methodology. We argue that a qualitative analysis of multilingual results, informed by comparative linguistics, is needed to detect this kind of bias. In a case study, we show that results in published work can indeed be linguistically biased, and we demonstrate that visualization based on the URIEL typological database can detect it.
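
As an illustration of the kind of analysis the abstract calls for, below is a minimal sketch (not the authors' code) of how one might place languages in URIEL's typological feature space and colour them by a per-language score, using the lang2vec package as an interface to URIEL. The language set, the scores, and the plotting choices are purely illustrative assumptions.

```python
# Minimal sketch, not the authors' code: visualize hypothetical per-language
# scores in URIEL's syntactic feature space to see whether performance
# clusters along typological (and hence language-family) lines.
import lang2vec.lang2vec as l2v  # pip install lang2vec (an interface to URIEL)
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Hypothetical task scores for a small set of languages (ISO 639-3 codes).
scores = {"eng": 0.91, "deu": 0.88, "fra": 0.87, "spa": 0.86,
          "rus": 0.80, "fin": 0.74, "tur": 0.72, "hin": 0.70,
          "jpn": 0.66, "kor": 0.65}

langs = list(scores)
# URIEL syntactic features with missing values imputed (the "_knn" feature sets).
feats = l2v.get_features(langs, "syntax_knn")
X = [feats[lang] for lang in langs]

# Project the typological vectors to 2D for plotting.
xy = PCA(n_components=2).fit_transform(X)

plt.scatter(xy[:, 0], xy[:, 1], c=[scores[lang] for lang in langs], cmap="viridis")
for (x, y), lang in zip(xy, langs):
    plt.annotate(lang, (x, y))
plt.colorbar(label="hypothetical task score")
plt.title("Per-language scores over URIEL syntactic space (illustrative)")
plt.show()
```

Plotting scores against typological features in this way, rather than reporting only an average, makes it visible when performance drops track typological distance from the dominant languages in a benchmark.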


Related research

05/12/2022
Beyond Static Models and Test Sets: Benchmarking the Potential of Pre-trained Models Across Tasks and Languages
Although recent Massively Multilingual Language Models (MMLMs) like mBER...

09/27/2016
AP16-OL7: A Multilingual Database for Oriental Languages and A Language Recognition Baseline
We present the AP16-OL7 database which was released as the training and ...

10/11/2022
Multilingual BERT has an accent: Evaluating English influences on fluency in multilingual models
While multilingual language models can improve NLP performance on low-re...

05/18/2023
Comparing Biases and the Impact of Multilingual Training across Multiple Languages
Studies in bias and fairness in natural language processing have primari...

05/24/2023
This Land is Your, My Land: Evaluating Geopolitical Biases in Language Models
We introduce the notion of geopolitical bias – a tendency to report diff...

07/25/2023
Towards Bridging the Digital Language Divide
It is a well-known fact that current AI-based language technology – lang...

02/16/2017
Fast and unsupervised methods for multilingual cognate clustering
In this paper we explore the use of unsupervised methods for detecting c...
