
Contrastive Learning for Context-aware Neural Machine Translation Using Coreference Information

by   Yongkeun Hwang, et al.

Context-aware neural machine translation (NMT) incorporates contextual information from surrounding text, which can improve the translation quality of document-level machine translation. Many existing works on context-aware NMT have focused on developing new model architectures for incorporating additional context and have shown promising results. However, most of them rely on cross-entropy loss, which results in limited use of contextual information. In this paper, we propose CorefCL, a novel data augmentation and contrastive learning scheme based on coreference between the source and contextual sentences. By corrupting automatically detected coreference mentions in the contextual sentence, CorefCL trains the model to be sensitive to coreference inconsistency. We evaluated our method on common context-aware NMT models and two document-level translation tasks. In the experiments, our method consistently improved the BLEU scores of the compared models on English-German and English-Korean tasks. We also show that our method significantly improves coreference resolution on the English-German contrastive test suite.
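The idea of corrupting coreference mentions and training contrastively can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the abstract does not specify the corruption strategy or the loss, so the sketch replaces each mention span with a hypothetical `<mask>` token and uses a generic max-margin loss over two translation log-probabilities. The names `corrupt_context` and `margin_contrastive_loss` are likewise hypothetical.

```python
from typing import List, Tuple

MASK = "<mask>"  # hypothetical placeholder token, not taken from the paper


def corrupt_context(
    tokens: List[str],
    mention_spans: List[Tuple[int, int]],
    mask: str = MASK,
) -> List[str]:
    """Replace each mention span [start, end) in the context with one mask token.

    The spans are assumed to come from an external coreference resolver.
    Spans that overlap an earlier span are skipped.
    """
    corrupted: List[str] = []
    cursor = 0
    for start, end in sorted(mention_spans):
        if start < cursor:
            continue  # overlaps a span that was already masked
        corrupted.extend(tokens[cursor:start])
        corrupted.append(mask)
        cursor = end
    corrupted.extend(tokens[cursor:])
    return corrupted


def margin_contrastive_loss(
    logp_clean: float,
    logp_corrupt: float,
    margin: float = 1.0,
) -> float:
    """Generic max-margin loss (an assumed form, not the paper's definition).

    It is zero once the target translation is at least `margin` more likely
    given the original context than given the corrupted context.
    """
    return max(0.0, margin - (logp_clean - logp_corrupt))


if __name__ == "__main__":
    context = "The cat sat on the mat".split()
    print(corrupt_context(context, [(0, 2)]))
    print(margin_contrastive_loss(-2.0, -1.5))
```

In a real training setup the two log-probabilities would come from the same NMT model scoring the reference translation under the original and the corrupted context, and this term would typically be added to the usual cross-entropy loss.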



Diving Deep into Context-Aware Neural Machine Translation

Context-aware neural machine translation (NMT) is a promising direction ...

Using Whole Document Context in Neural Machine Translation

In Machine Translation, considering the document as a whole can help to ...

Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns

The advent of context-aware NMT has resulted in promising improvements i...

Divide and Rule: Training Context-Aware Multi-Encoder Translation Models with Little Resources

Multi-encoder models are a broad family of context-aware Neural Machine ...

Deep Context-Aware Novelty Detection

A common assumption of novelty detection is that the distribution of bot...

Implicit Context-aware Learning and Discovery for Streaming Data Analytics

The performance of machine learning model can be further improved if con...

Modeling Bilingual Conversational Characteristics for Neural Chat Translation

Neural chat translation aims to translate bilingual conversational text,...