NMTScore: A Multilingual Analysis of Translation-based Text Similarity Measures

04/28/2022
by Jannis Vamvas, et al.

Being able to rank the similarity of short text segments is an interesting bonus feature of neural machine translation. Translation-based similarity measures include direct and pivot translation probability, as well as translation cross-likelihood, which has not been studied so far. We analyze these measures in the common framework of multilingual NMT, releasing the NMTScore library (available at https://github.com/ZurichNLP/nmtscore). Compared to baselines such as sentence embeddings, translation-based measures prove competitive in paraphrase identification and are more robust against adversarial or multilingual input, especially if proper normalization is applied. When used for reference-based evaluation of data-to-text generation in 2 tasks and 17 languages, translation-based measures show a relatively high correlation to human judgments.
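To make the idea of a normalized translation-based score concrete, here is a minimal Python sketch. It does not use the NMTScore library or any real translation model. It assumes we already have per-token log-probabilities from some NMT model, and it assumes one plausible reading of "normalization": the length-normalized probability of sentence A given sentence B is divided by the length-normalized probability of A given itself. The function names and the numbers in the usage lines are hypothetical and serve only as illustration; the abstract does not specify the exact formulas.

```python
import math
from typing import Sequence


def length_normalized_prob(token_logprobs: Sequence[float]) -> float:
    """Geometric mean of the token probabilities, i.e. exp(mean log p).

    Assumption: token_logprobs holds the natural-log probabilities a
    translation model assigned to each token of the target sentence.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def normalized_score(a_given_b: Sequence[float],
                     a_given_a: Sequence[float]) -> float:
    """Score of A given B, divided by the score of A given itself.

    Assumption: this ratio is one way to normalize, so that a sentence
    compared with itself scores 1.0 regardless of its length or rarity.
    """
    return length_normalized_prob(a_given_b) / length_normalized_prob(a_given_a)


def symmetric_score(score_a_given_b: float, score_b_given_a: float) -> float:
    """Average the two directions to obtain a symmetric similarity."""
    return 0.5 * (score_a_given_b + score_b_given_a)


# Hypothetical log-probabilities for a two-token sentence A.
a_given_b = [math.log(0.25), math.log(0.25)]
a_given_a = [math.log(0.5), math.log(0.5)]
forward = normalized_score(a_given_b, a_given_a)
similarity = symmetric_score(forward, 0.7)  # 0.7 is a made-up reverse score
```

Under these assumptions, the same helpers would apply whether the log-probabilities come from direct translation, from scoring against a pivot translation, or from cross-likelihood; only the way the log-probabilities are obtained differs.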
