Detecting Machine-Translated Text using Back Translation

10/15/2019
by Hoang-Quoc Nguyen-Son, et al.

Machine-translated text plays a crucial role in communication between people who use different languages. However, adversaries can use such text for malicious purposes such as plagiarism and fake reviews. Existing methods detect machine-translated text using only the text's intrinsic content, which makes them unsuitable for distinguishing machine-translated texts from human-written texts with the same meaning. We propose a method that extracts features for distinguishing machine text from human text based on the similarity between the original text and its back-translation. In an evaluation on detecting sentences translated with French, our method achieves 75.0%. It outperforms the existing methods, whose best accuracy is 62.8% and F-score is 62.7%. It also detects back-translated text at 83.4%, above the best previous accuracy. We achieve similar results not only with F-score but also in similar experiments with Japanese. Moreover, we show that our detector can recognize both machine-translated and machine-back-translated texts without knowing the language that was used to generate these machine texts. This demonstrates the persistence of our method in various applications in both low- and rich-resource languages.
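As a rough illustration of the idea of comparing a text with its back-translation, the sketch below computes two simple similarity features. It is a minimal sketch under stated assumptions, not the authors' implementation: the `translate` callable, its `(text, source_lang, target_lang)` signature, the function names, and the two similarity measures (a `difflib` sequence ratio and a token Jaccard overlap) are all hypothetical stand-ins chosen for this example. The abstract does not say which similarity measures or which translation system the paper uses.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict

# Assumption: any machine-translation backend can be wrapped in a callable
# with this hypothetical signature: translate(text, source_lang, target_lang).
Translator = Callable[[str, str, str], str]


def back_translate(text: str, translate: Translator,
                   lang: str = "en", pivot: str = "fr") -> str:
    """Translate text into the pivot language and back into the original one."""
    forward = translate(text, lang, pivot)
    return translate(forward, pivot, lang)


def similarity_features(text: str, translate: Translator,
                        lang: str = "en", pivot: str = "fr") -> Dict[str, float]:
    """Compare a text with its back-translation using two toy similarity scores."""
    back = back_translate(text, translate, lang, pivot)

    # Lowercased whitespace tokens; a real system would use a proper tokenizer.
    original_tokens = text.lower().split()
    back_tokens = back.lower().split()

    # Order-sensitive similarity over the two token sequences.
    sequence_ratio = SequenceMatcher(None, original_tokens, back_tokens).ratio()

    # Order-insensitive overlap between the two vocabularies.
    original_set, back_set = set(original_tokens), set(back_tokens)
    union = original_set | back_set
    token_jaccard = len(original_set & back_set) / len(union) if union else 1.0

    return {"sequence_ratio": sequence_ratio, "token_jaccard": token_jaccard}
```

Feature dictionaries like this one could then be fed to any standard classifier trained on labeled human and machine texts. The choice of classifier is likewise left open here, since the abstract does not specify one.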


