A Multi-dimensional Evaluation of Tokenizer-free Multilingual Pretrained Models

10/13/2022
by Jimin Sun, et al.

Recent work on tokenizer-free multilingual pretrained models shows promising results in improving cross-lingual transfer and reducing engineering overhead (Clark et al., 2022; Xue et al., 2022). However, these works mainly focus on reporting accuracy on a limited set of tasks and data settings, placing less emphasis on other important factors when tuning and deploying the models in practice, such as memory usage, inference speed, and robustness to fine-tuning data. We attempt to fill this gap by performing a comprehensive empirical comparison of multilingual tokenizer-free and subword-based models along these various dimensions. Surprisingly, we find that subword-based models might still be the most practical choice in many settings, achieving better performance at lower inference latency and memory usage. Based on these results, we encourage future work in tokenizer-free methods to consider these factors when designing and evaluating new models.
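
As a rough illustration of the efficiency axes the abstract describes, the sketch below encodes the same sentence with a subword tokenizer and a byte-level tokenizer, compares the resulting sequence lengths, and times an encoder forward pass for each. It assumes the HuggingFace transformers library and uses xlm-roberta-base and google/byt5-small as illustrative stand-ins for the subword-based and tokenizer-free families; the model choices, single-sentence input, and CPU timing loop are simplifications for demonstration, not the paper's actual benchmark protocol.

```python
# Minimal sketch: sequence-length and latency comparison between a subword
# model (XLM-R) and a byte-level model (ByT5). Illustrative only; not the
# paper's evaluation setup.
import time
import torch
from transformers import AutoTokenizer, AutoModel, T5EncoderModel

text = "Multilingual pretrained models must balance accuracy and efficiency."

subword_tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
byte_tok = AutoTokenizer.from_pretrained("google/byt5-small")

subword_inputs = subword_tok(text, return_tensors="pt")
byte_inputs = byte_tok(text, return_tensors="pt")

# Byte-level inputs are several times longer than subword inputs for the
# same text, which drives up attention cost and memory.
print("subword length:", subword_inputs["input_ids"].shape[1])
print("byte length:   ", byte_inputs["input_ids"].shape[1])

subword_model = AutoModel.from_pretrained("xlm-roberta-base").eval()
byte_model = T5EncoderModel.from_pretrained("google/byt5-small").eval()

def time_forward(model, inputs, n_runs=20):
    """Average wall-clock time of n_runs forward passes (CPU, batch size 1)."""
    with torch.no_grad():
        model(**inputs)  # warm-up pass, excluded from timing
        start = time.perf_counter()
        for _ in range(n_runs):
            model(**inputs)
    return (time.perf_counter() - start) / n_runs

print(f"subword encoder: {time_forward(subword_model, subword_inputs):.4f}s/pass")
print(f"byte encoder:    {time_forward(byte_model, byte_inputs):.4f}s/pass")
```

Because self-attention cost grows with sequence length, and byte-level encodings are typically several times longer than subword encodings of the same text (more so for non-Latin scripts), sequence length is a main driver of the latency and memory differences such a comparison surfaces.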

Related research

First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT (01/26/2021)
Multilingual pretrained language models have demonstrated remarkable zer...

Multi-view Subword Regularization (03/15/2021)
Multilingual pretrained representations generally rely on subword segmen...

Cross-lingual Intermediate Fine-tuning improves Dialogue State Tracking (09/28/2021)
Recent progress in task-oriented neural dialogue systems is largely focu...

Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval (01/21/2021)
Pretrained multilingual text encoders based on neural Transformer archit...

Towards Best Practices for Training Multilingual Dense Retrieval Models (04/05/2022)
Dense retrieval models using a transformer-based bi-encoder design have ...

Efficient Test Time Adapter Ensembling for Low-resource Language Varieties (09/10/2021)
Adapters are light-weight modules that allow parameter-efficient fine-tu...

Synergy with Translation Artifacts for Training and Inference in Multilingual Tasks (10/18/2022)
Translation has played a crucial role in improving the performance on mu...
