Detecting Machine-Translated Paragraphs by Matching Similar Words

04/24/2019
by Hoang-Quoc Nguyen-Son, et al.

Machine-translated text plays an important role in modern life by smoothing communication among communities that use different languages. However, unnatural translation may lead to misunderstanding, so a detector is needed to avoid such mistakes. One previous method measured the naturalness of continuous words using an N-gram language model. Another matched noncontinuous words across sentences, but it ignores such words within an individual sentence. We have developed a method that matches similar words throughout the paragraph and estimates paragraph-level coherence in order to identify machine-translated text. An experiment on 2000 English human-generated paragraphs and 2000 English paragraphs machine-translated from German shows that the coherence-based method achieves high performance (accuracy = 87.0%) and outperforms previous methods (best accuracy = 72.4%). Similar experiments on Dutch and Japanese also reach high accuracy, with 89.2% reported. The results demonstrate the persistence of the proposed method in various languages with different resource levels.
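The idea of matching similar words across a whole paragraph can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`tokenize`, `similarity`, `coherence_features`) and the two summary features are hypothetical, and character-level similarity from Python's `difflib` stands in for whatever word-similarity measure the paper actually uses. The sketch only shows the general shape of the computation: every word is matched to its most similar counterpart anywhere in the paragraph, and the match scores are summarised into paragraph-level features that a classifier could consume.

```python
import re
from difflib import SequenceMatcher
from statistics import mean


def tokenize(paragraph):
    """Lowercase the paragraph and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", paragraph.lower())


def similarity(a, b):
    """Assumed stand-in for a word-similarity measure (character overlap)."""
    return SequenceMatcher(None, a, b).ratio()


def coherence_features(paragraph):
    """Match each word to its most similar other word in the paragraph.

    Returns hypothetical paragraph-level features: the mean and the
    minimum of the best-match similarity scores.
    """
    words = tokenize(paragraph)
    if len(words) < 2:
        # Nothing to match against, so report zero coherence.
        return {"mean_best_match": 0.0, "min_best_match": 0.0}

    best_scores = []
    for i, word in enumerate(words):
        # Compare against every other position, within and across sentences.
        best = max(
            similarity(word, other)
            for j, other in enumerate(words)
            if j != i
        )
        best_scores.append(best)

    return {
        "mean_best_match": mean(best_scores),
        "min_best_match": min(best_scores),
    }


if __name__ == "__main__":
    text = "The translator translated the text. The translation reads well."
    print(coherence_features(text))
```

In a fuller pipeline, features like these would be computed for both human-written and machine-translated paragraphs and passed to a standard classifier. The choice of similarity measure and of summary statistics is left open here.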


