Non-Autoregressive Document-Level Machine Translation (NA-DMT): Exploring Effective Approaches, Challenges, and Opportunities

05/22/2023
by Guangsheng Bao, et al.

Non-autoregressive translation (NAT) models have been studied extensively for sentence-level machine translation (MT), where they achieve quality comparable to autoregressive translation (AT) models at substantially higher translation speed. However, the multi-modality and alignment problems of NAT models become more pronounced as input and output length increase, which leads to unexpected complications in document-level MT. In this paper, we conduct a comprehensive examination of typical NAT models on document-level MT tasks. Experiments reveal that, although NAT models significantly accelerate text generation on documents, they do not perform as well as they do on sentences. To bridge this performance gap, we introduce a novel design that underscores the importance of sentence-level alignment for non-autoregressive document-level machine translation (NA-DMT). This design substantially reduces the performance gap. However, NA-DMT models are still far from perfect and may require further research to reach their full potential. We discuss the related opportunities and challenges and provide our code at https://github.com/baoguangsheng/nat-on-doc to stimulate further research in this field.


