Comparing Formulaic Language in Human and Machine Translation: Insight from a Parliamentary Corpus

06/22/2022
by Yves Bestgen, et al.

A recent study has shown that, compared to human translations, neural machine translations contain more strongly associated formulaic sequences made of relatively high-frequency words, but far fewer strongly associated formulaic sequences made of relatively rare words. These results were obtained from translations of quality newspaper articles, a genre in which human translations tend not to be very literal. The present study attempts to replicate this research using a parliamentary corpus. The texts were translated from French to English by three well-known neural machine translation systems: DeepL, Google Translate and Microsoft Translator. The results confirm the observations made on the news corpus, but the differences are smaller. They suggest that text genres that usually result in more literal translations, such as parliamentary corpora, might be preferable when comparing human and machine translations. Regarding the differences between the three neural machine translation systems, it appears that Google translations contain fewer highly collocational bigrams, as identified by the CollGram technique, than DeepL and Microsoft translations.
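The contrast between formulaic sequences made of high-frequency words and those made of rare words can be illustrated with two standard bigram association measures, mutual information (MI) and the t-score. The sketch below is only an illustration under stated assumptions: the abstract does not give the formulas or thresholds used in the study, the function name `bigram_association` is hypothetical, and the counts are taken from a toy token list rather than from a large reference corpus.

```python
import math
from collections import Counter


def bigram_association(tokens):
    """Return {bigram: (mi, t_score)} computed from raw counts in `tokens`.

    Assumption: textbook definitions of MI and t-score; a real analysis
    would take the counts from a large reference corpus, not the text itself.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), observed in bigrams.items():
        # Expected co-occurrence count if the two words were independent.
        expected = unigrams[w1] * unigrams[w2] / n
        mi = math.log2(observed / expected)
        t_score = (observed - expected) / math.sqrt(observed)
        scores[(w1, w2)] = (mi, t_score)
    return scores


# Toy example (hypothetical data, not from the parliamentary corpus).
tokens = "the house of commons met in the house of lords".split()
scores = bigram_association(tokens)
frequent = scores[("the", "house")]  # pair of relatively frequent words
rare = scores[("commons", "met")]    # pair of words that each occur once
```

In this toy example the pair of rare words receives the higher MI, while the pair of frequent words receives the higher t-score, which is the general tendency of the two measures and the reason they are useful for separating the two kinds of formulaic sequences.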

research
07/08/2021

Using CollGram to Compare Formulaic Language in Human and Neural Machine Translation

A comparison of formulaic sequences in human and neural machine translat...
research
05/06/2021

Quantitative Evaluation of Alternative Translations in a Corpus of Highly Dissimilar Finnish Paraphrases

In this paper, we present a quantitative evaluation of differences betwe...
research
11/24/2021

Cultural and Geographical Influences on Image Translatability of Words across Languages

Neural Machine Translation (NMT) models have been observed to produce po...
research
02/28/2018

Analyzing Uncertainty in Neural Machine Translation

Machine translation is a popular test bed for research in neural sequenc...
research
03/28/2019

Train, Sort, Explain: Learning to Diagnose Translation Models

Evaluating translation models is a trade-off between effort and detail. ...
research
11/04/2019

Analysing Coreference in Transformer Outputs

We analyse coreference phenomena in three neural machine translation sys...
research
09/14/2017

Machine-Translation History and Evolution: Survey for Arabic-English Translations

As a result of the rapid changes in information and communication techno...
