Linguistic Features of Genre and Method Variation in Translation: A Computational Perspective

In this paper we describe the use of text classification methods to investigate genre and translation-method variation in an English-German translation corpus. For this purpose we use linguistically motivated features: texts are represented as part-of-speech tags arranged in bigrams, trigrams, and 4-grams. The classification method is a Bayesian classifier with Laplace smoothing. We use the output of the classifiers to carry out an extensive feature analysis of the main differences between genres and methods of translation.
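The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not name the tagset, the tagger, or the exact classifier variant, so the code below makes several assumptions: a multinomial Naive Bayes model stands in for the "Bayesian classifier", add-one smoothing (`alpha=1.0`) stands in for Laplace smoothing, and the tag sequences and the labels `fiction` and `manual` are invented toy data. The names `pos_ngrams` and `NaiveBayes` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def pos_ngrams(tags, sizes=(2, 3, 4)):
    """Turn a POS tag sequence into bigram, trigram, and 4-gram features."""
    feats = []
    for n in sizes:
        for i in range(len(tags) - n + 1):
            feats.append("_".join(tags[i:i + n]))
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes over POS n-grams (an assumed variant)."""

    def __init__(self, alpha=1.0):
        # alpha=1.0 corresponds to Laplace (add-one) smoothing.
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = {c: Counter() for c in self.class_counts}
        for tags, label in zip(docs, labels):
            self.feat_counts[label].update(pos_ngrams(tags))
        self.vocab = set()
        for counts in self.feat_counts.values():
            self.vocab.update(counts)
        return self

    def log_scores(self, tags):
        """Return the unnormalised log posterior of each class."""
        feats = pos_ngrams(tags)
        total_docs = sum(self.class_counts.values())
        scores = {}
        for c, n_docs in self.class_counts.items():
            counts = self.feat_counts[c]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)
            for f in feats:
                if f in self.vocab:  # n-grams unseen in training are skipped
                    score += math.log((counts[f] + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, tags):
        scores = self.log_scores(tags)
        return max(scores, key=scores.get)


# Invented toy data: each document is a sequence of POS tags.
train_docs = [
    ["PRON", "VERB", "DET", "NOUN", "PUNCT"],
    ["PRON", "VERB", "ADV", "PUNCT"],
    ["VERB", "DET", "NOUN", "ADP", "DET", "NOUN", "PUNCT"],
    ["VERB", "DET", "ADJ", "NOUN", "PUNCT"],
]
train_labels = ["fiction", "fiction", "manual", "manual"]

model = NaiveBayes(alpha=1.0).fit(train_docs, train_labels)
print(model.predict(["PRON", "VERB", "DET", "NOUN", "PUNCT"]))
```

Because the model stores smoothed per-class counts for every n-gram, comparing those counts across classes is one simple way to see which POS patterns separate the classes, which is the kind of feature analysis the abstract refers to.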
