Can Transformers Jump Around Right in Natural Language? Assessing Performance Transfer from SCAN

07/03/2021
by Rahma Chaabouni et al.

Despite their practical success, modern seq2seq architectures are unable to generalize systematically on several SCAN tasks. Hence, it is not clear whether SCAN-style compositional generalization is useful in realistic NLP tasks. In this work, we study the benefit that such compositionality brings to several machine translation tasks. We present several focused modifications of the Transformer that greatly improve generalization capabilities on SCAN, and select one that remains on par with a vanilla Transformer on a standard machine translation (MT) task. Next, we study its performance in low-resource settings and on a newly introduced distribution-shifted English-French translation task. Overall, we find that the improvements of a SCAN-capable model do not directly transfer to the resource-rich MT setup. In contrast, in the low-resource setup, general modifications lead to an improvement of up to 13.1% over the vanilla Transformer. Similarly, a 14% improvement is achieved on the introduced compositional English-French translation task. This provides experimental evidence that the compositional generalization assessed in SCAN is particularly useful in resource-starved and domain-shifted scenarios.
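To make the title concrete: SCAN (Lake & Baroni, 2018) maps commands such as "jump around right" to action sequences, and compositional generalization means correctly handling novel combinations of known primitives and modifiers. The following minimal sketch (our illustration, not code from the paper) interprets a small fragment of the SCAN grammar:

```python
# Minimal sketch of SCAN-style command interpretation (illustrative only;
# covers just the primitives and the "right" / "around right" modifiers).
PRIMITIVES = {"jump": "I_JUMP", "walk": "I_WALK", "run": "I_RUN", "look": "I_LOOK"}

def interpret(command):
    words = command.split()
    verb = PRIMITIVES[words[0]]
    rest = words[1:]
    if rest == ["around", "right"]:
        # "around right": turn right and act, four times (a full circle)
        return ["I_TURN_RIGHT", verb] * 4
    if rest == ["right"]:
        return ["I_TURN_RIGHT", verb]
    if not rest:
        return [verb]
    raise ValueError(f"unsupported modifier: {rest}")

print(interpret("jump around right"))
# A model generalizes compositionally if, having seen "jump" alone and
# "walk around right", it can produce the sequence above for "jump around right".
```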

Related research:

04/30/2020 · Simulated Multiple Reference Training Improves Low-Resource Machine Translation. Many valid translations exist for a given sentence, and yet machine tran...

03/24/2023 · Towards Making the Most of ChatGPT for Machine Translation. ChatGPT shows remarkable capabilities for machine translation (MT). Seve...

12/24/2022 · Optimizing Deep Transformers for Chinese-Thai Low-Resource Translation. In this paper, we study the use of deep Transformer translation model fo...

09/28/2022 · Effective General-Domain Data Inclusion for the Machine Translation Task by Vanilla Transformers. One of the vital breakthroughs in the history of machine translation is ...

09/13/2022 · Data-adaptive Transfer Learning for Translation: A Case Study in Haitian and Jamaican. Multilingual transfer techniques often improve low-resource machine tran...

02/15/2023 · Dictionary-based Phrase-level Prompting of Large Language Models for Machine Translation. Large language models (LLMs) demonstrate remarkable machine translation ...

10/07/2019 · Compositional Generalization for Primitive Substitutions. Compositional generalization is a basic mechanism in human language lear...
