Automatic Classification of Human Translation and Machine Translation: A Study from the Perspective of Lexical Diversity

05/10/2021
by Yingxue Fu, et al.

By using a trigram model and by fine-tuning a pretrained BERT model for sequence classification, we show that machine translation and human translation can be classified with an accuracy above chance level, which suggests that the two differ in a systematic way. The classification accuracy for machine translation is much higher than that for human translation. We show that this may be explained by the difference in lexical diversity between machine translation and human translation. If machine translation follows patterns that are independent of human translation, automatic metrics that measure the deviation of machine translation from human translation may conflate difference with quality. Our experiment with two different types of automatic metrics shows a correlation with the results of the classification task. We therefore suggest that the difference in lexical diversity between machine translation and human translation be given more attention in machine translation evaluation.
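To make the notion of lexical diversity concrete, here is a minimal sketch that computes a type-token ratio (TTR), i.e. the number of distinct words divided by the total number of words. The abstract does not say which diversity measure the study uses, so TTR is an assumption chosen for simplicity; the function name `type_token_ratio` and both example sentences are hypothetical and are not data from the paper.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return distinct word types divided by total word tokens (0.0 if empty)."""
    # Simple lowercase word tokenizer; an assumption, not the paper's tokenizer.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Toy sentences invented for illustration only.
human = "The weary traveller trudged home while evening shadows lengthened"
machine = "The tired man went home and the man went to the home"

print(f"human TTR:   {type_token_ratio(human):.2f}")
print(f"machine TTR: {type_token_ratio(machine):.2f}")
```

A lower ratio indicates more repeated vocabulary. Raw TTR is sensitive to text length, so comparisons are only meaningful between samples of similar size.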


