Examining Large Pre-Trained Language Models for Machine Translation: What You Don't Know About It

09/15/2022
by Lifeng Han, et al.

Pre-trained language models (PLMs) often take advantage of monolingual and multilingual datasets that are freely available online to acquire general or mixed-domain knowledge before deployment on specific tasks. Extra-large PLMs (xLPLMs) have recently been proposed with claims of superior performance over smaller-sized PLMs, for instance on machine translation (MT) tasks. These xLPLMs include Meta-AI's wmt21-dense-24-wide-en-X (2021) and NLLB (2022). In this work, we examine whether xLPLMs are absolutely superior to smaller-sized PLMs when fine-tuned toward domain-specific MT. We use two in-domain datasets of different sizes: commercial automotive in-house data and clinical shared-task data from the ClinSpEn2022 challenge at WMT2022. We choose the popular Marian Helsinki as the smaller-sized PLM and two massive Mega-Transformers from Meta-AI as the xLPLMs. Our experimental investigation shows that 1) on the smaller in-domain commercial automotive data, the xLPLM wmt21-dense-24-wide-en-X indeed achieves much better evaluation scores on SacreBLEU and hLEPOR than the smaller-sized Marian, even though its rate of score increase after fine-tuning is lower than Marian's; 2) on the relatively larger, well-prepared clinical data, the fine-tuned xLPLM NLLB tends to lose its advantage over the smaller-sized Marian on two sub-tasks (clinical terms and ontology concepts) under the ClinSpEn-offered metrics METEOR, COMET, and ROUGE-L, and loses entirely to Marian on Task-1 (clinical cases) under all official metrics, including SacreBLEU and BLEU; 3) metrics do not always agree with each other on the same tasks using the same model outputs.
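
For readers who want a concrete picture of the setup, the sketch below shows, in broad strokes, how fine-tuning a smaller-sized Marian Helsinki PLM on in-domain parallel data and scoring the output with SacreBLEU can be done with the Hugging Face transformers, datasets, and sacrebleu libraries. This is a minimal illustration, not the authors' released pipeline: the English-to-Spanish model choice, the toy sentence pair, and all hyperparameters are assumptions made for the example.

```python
# Minimal sketch (not the paper's actual pipeline): fine-tune a smaller-sized
# Marian Helsinki PLM on a toy in-domain parallel pair, then score with SacreBLEU.
from datasets import Dataset
from transformers import (
    MarianMTModel,
    MarianTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)
import sacrebleu

model_name = "Helsinki-NLP/opus-mt-en-es"  # assumed English->Spanish direction
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Hypothetical in-domain pair; the paper uses in-house automotive and
# ClinSpEn2022 clinical data, which are not reproduced here.
pairs = {
    "en": ["The patient presented with chest pain."],
    "es": ["El paciente presentó dolor torácico."],
}
raw = Dataset.from_dict(pairs)

def preprocess(batch):
    # Tokenize source sentences and attach target token ids as labels.
    enc = tokenizer(batch["en"], truncation=True, max_length=128)
    enc["labels"] = tokenizer(
        text_target=batch["es"], truncation=True, max_length=128
    )["input_ids"]
    return enc

train = raw.map(preprocess, batched=True, remove_columns=["en", "es"])

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="marian-ft",
        num_train_epochs=1,
        per_device_train_batch_size=1,
    ),
    train_dataset=train,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()

# Translate and score with SacreBLEU, one of the metrics used in the paper.
inputs = tokenizer(pairs["en"], return_tensors="pt", padding=True)
outputs = model.generate(**inputs, max_length=128)
hyps = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(sacrebleu.corpus_bleu(hyps, [pairs["es"]]).score)
```

In the paper's comparison, the same fine-tune-then-evaluate procedure is applied to the much larger Meta-AI models (wmt21-dense-24-wide-en-X and NLLB); only the underlying checkpoint and the in-domain corpus change.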

research · 10/12/2022
Using Massive Multilingual Pre-Trained Language Models Towards Real Zero-Shot Neural Machine Translation in Clinical Domain
Massively multilingual pre-trained language models (MMPLMs) are develope...

research · 04/30/2020
Recipes for Adapting Pre-trained Monolingual and Multilingual Models to Machine Translation
There has been recent success in pre-training on monolingual data and fi...

research · 08/07/2023
MedMine: Examining Pre-trained Language Models on Medication Mining
Automatic medication mining from clinical and biomedical text has become...

research · 04/19/2022
PICT@DravidianLangTech-ACL2022: Neural Machine Translation On Dravidian Languages
This paper presents a summary of the findings that we obtained based on ...

research · 10/26/2022
Robust Domain Adaptation for Pre-trained Multilingual Neural Machine Translation Models
Recent literature has demonstrated the potential of multilingual Neural ...

research · 09/14/2021
Netmarble AI Center's WMT21 Automatic Post-Editing Shared Task Submission
This paper describes Netmarble's submission to WMT21 Automatic Post-Edit...
