SemMT: A Semantic-based Testing Approach for Machine Translation Systems

12/03/2020
by Jialun Cao, et al.

Machine translation has wide applications in daily life. In mission-critical applications such as translating official documents, an incorrect translation can have unpleasant or sometimes catastrophic consequences. This motivates recent research on testing methodologies for machine translation systems. Existing methodologies mostly rely on metamorphic relations designed at the textual level (e.g., Levenshtein distance) or syntactic level (e.g., the distance between grammar structures) to determine the correctness of translation results. However, these metamorphic relations do not consider whether the original and translated sentences have the same meaning (i.e., semantic similarity). Therefore, in this paper, we propose SemMT, an automatic testing approach for machine translation systems based on semantic similarity checking. SemMT applies round-trip translation and measures the semantic similarity between the original and translated sentences. Our insight is that the semantics expressed by the logical and numeric constraints in sentences can be captured using regular expressions (or deterministic finite automata), for which efficient equivalence/similarity checking algorithms are available. Leveraging this insight, we propose three semantic similarity metrics and implement them in SemMT. The experimental results show that SemMT achieves higher effectiveness than state-of-the-art works, with an increase of 21% and 23% in accuracy and F-Score, respectively. We also explore potential improvements that can be achieved when proper combinations of metrics are adopted. Finally, we discuss a solution to locate the suspicious trip in round-trip translation, which may shed light on further exploration.
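The round-trip idea in the abstract can be illustrated with a minimal Python sketch. It is not SemMT's implementation and does not reproduce the three metrics proposed in the paper. As a stand-in, it compares two regular expressions by enumerating every string up to a small length bound over a tiny alphabet and taking the Jaccard overlap of the accepted sets. All names (`bounded_language`, `jaccard_similarity`, `round_trip_check`), the bound, the threshold, and the `forward`, `backward`, and `to_regex` callables are assumptions introduced for illustration only.

```python
import re
from itertools import product


def bounded_language(pattern, alphabet, max_len):
    """Strings over `alphabet`, up to `max_len` characters, fully matched by `pattern`."""
    compiled = re.compile(pattern)
    accepted = set()
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if compiled.fullmatch(candidate):
                accepted.add(candidate)
    return accepted


def jaccard_similarity(pattern_a, pattern_b, alphabet="ab", max_len=4):
    """Approximate similarity of two regexes as the Jaccard overlap of their bounded languages."""
    lang_a = bounded_language(pattern_a, alphabet, max_len)
    lang_b = bounded_language(pattern_b, alphabet, max_len)
    if not lang_a and not lang_b:
        return 1.0  # both accept nothing within the bound
    return len(lang_a & lang_b) / len(lang_a | lang_b)


def round_trip_check(sentence, forward, backward, to_regex, threshold=0.9):
    """Translate there and back, then compare the regexes of both sentences.

    `forward`, `backward`, and `to_regex` are hypothetical callables supplied
    by the caller; a similarity below `threshold` flags a suspicious translation.
    """
    round_tripped = backward(forward(sentence))
    similarity = jaccard_similarity(to_regex(sentence), to_regex(round_tripped))
    return similarity, similarity < threshold


# Stub translators and a stub sentence-to-regex mapping, for demonstration only.
regex_of = {
    "at least two a": "a{2,}",
    "at least three a": "a{3,}",
}
forward = lambda s: "<target> " + s  # pretend translation into the target language
backward = lambda s: "at least three a"  # pretend back-translation that alters the number

similarity, suspicious = round_trip_check(
    "at least two a", forward, backward, regex_of.__getitem__
)
print(similarity, suspicious)
```

Bounded enumeration is exponential in the length bound, so it only suits toy alphabets. An exact comparison would instead work on the corresponding automata, as the abstract suggests.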

