On the Calibration of Massively Multilingual Language Models

10/21/2022
by Kabir Ahuja, et al.

Massively Multilingual Language Models (MMLMs) have recently gained popularity due to their surprising effectiveness in cross-lingual transfer. While there has been much work evaluating these models' performance on a variety of tasks and languages, little attention has been paid to how well calibrated they are with respect to the confidence in their predictions. We first investigate the calibration of MMLMs in the zero-shot setting and observe clear miscalibration in low-resource languages and in languages typologically distant from English. Next, we show empirically that calibration methods such as temperature scaling and label smoothing are reasonably effective at improving calibration in the zero-shot scenario. We also find that few-shot examples in the target language can further reduce calibration errors, often substantially. Overall, our work contributes towards building more reliable multilingual models by highlighting the issue of miscalibration, understanding which language- and model-specific factors influence it, and pointing out strategies to mitigate it.
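A model is well calibrated when its confidence matches its empirical accuracy; miscalibration is commonly measured with Expected Calibration Error (ECE), and temperature scaling, one of the fixes the abstract mentions, simply divides the logits by a temperature T before the softmax. The sketch below is a minimal illustration of both ideas, not the authors' code: the equal-width binning scheme, the synthetic overconfident classifier, and the chosen temperature are all assumptions for demonstration.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: average |accuracy - confidence| per bin, weighted by bin size."""
    ece = 0.0
    for lo in np.linspace(0.0, 1.0, n_bins, endpoint=False):
        hi = lo + 1.0 / n_bins
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece

def temperature_scale(logits, T):
    """Softmax over logits / T; T > 1 softens overconfident predictions
    without changing the argmax (so accuracy is unaffected)."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

# Synthetic binary classifier: right 80% of the time, but ~99% confident.
rng = np.random.default_rng(0)
n = 1000
labels = rng.integers(0, 2, n)
pred = np.where(rng.random(n) < 0.8, labels, 1 - labels)
logits = np.zeros((n, 2))
logits[np.arange(n), pred] = 5.0  # large margin -> overconfidence

correct = (pred == labels).astype(float)
conf_before = temperature_scale(logits, 1.0).max(axis=1)
conf_after = temperature_scale(logits, 4.0).max(axis=1)

print(expected_calibration_error(conf_before, correct))  # large gap
print(expected_calibration_error(conf_after, correct))   # much smaller
```

In practice T is not hand-picked as above but fitted on a held-out set by minimizing negative log-likelihood; the paper's zero-shot setting makes the choice of which language supplies that held-out set part of the question.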


