Leveraging Auxiliary Domain Parallel Data in Intermediate Task Fine-tuning for Low-resource Translation

06/02/2023
by   Shravan Nayak, et al.
0

NMT systems trained on Pre-trained Multilingual Sequence-Sequence (PMSS) models flounder when sufficient amounts of parallel data is not available for fine-tuning. This specifically holds for languages missing/under-represented in these models. The problem gets aggravated when the data comes from different domains. In this paper, we show that intermediate-task fine-tuning (ITFT) of PMSS models is extremely beneficial for domain-specific NMT, especially when target domain data is limited/unavailable and the considered languages are missing or under-represented in the PMSS model. We quantify the domain-specific results variations using a domain-divergence test, and show that ITFT can mitigate the impact of domain divergence to some extent.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/16/2022

Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation?

What can pre-trained multilingual sequence-to-sequence models like mBART...
research
07/06/2019

Exploiting Out-of-Domain Parallel Data through Multilingual Transfer Learning for Low-Resource Neural Machine Translation

This paper proposes a novel multilingual multistage fine-tuning approach...
research
09/07/2021

IndicBART: A Pre-trained Model for Natural Language Generation of Indic Languages

In this paper we present IndicBART, a multilingual, sequence-to-sequence...
research
12/10/2020

Exploring Pair-Wise NMT for Indian Languages

In this paper, we address the task of improving pair-wise machine transl...
research
08/20/2022

General-to-Specific Transfer Labeling for Domain Adaptable Keyphrase Generation

Training keyphrase generation (KPG) models requires a large amount of an...
research
03/01/2023

A Systematic Analysis of Vocabulary and BPE Settings for Optimal Fine-tuning of NMT: A Case Study of In-domain Translation

The effectiveness of Neural Machine Translation (NMT) models largely dep...
research
09/11/2022

Detecting Suicide Risk in Online Counseling Services: A Study in a Low-Resource Language

With the increased awareness of situations of mental crisis and their so...

Please sign up or login with your details

Forgot password? Click here to reset