
Translator2Vec: Understanding and Representing Human Post-Editors

by António Góis, et al.
Unbabel Inc.

The combination of machines and humans for translation is effective, with many studies showing productivity gains when humans post-edit machine-translated output instead of translating from scratch. To take full advantage of this combination, we need a fine-grained understanding of how human translators work, and which post-editing styles are more effective than others. In this paper, we release and analyze a new dataset with document-level post-editing action sequences, including edit operations from keystrokes, mouse actions, and waiting times. Our dataset comprises 66,268 full document sessions post-edited by 332 humans, the largest of its kind released to date. We show that action sequences are informative enough to identify post-editors accurately, compared to baselines that only look at the initial and final text. We build on this to learn and visualize continuous representations of post-editors, and we show that these representations improve the downstream task of predicting post-editing time.




IntelliCAT: Intelligent Machine Translation Post-Editing with Quality Estimation and Translation Suggestion

We present IntelliCAT, an interactive translation interface with neural ...

Assessing Post-editing Effort in the English-Hindi Direction

We present findings from a first in-depth post-editing effort estimation...

Visual Story Post-Editing

We introduce the first dataset for human edits of machine-generated visu...

Manual Post-editing of Automatically Transcribed Speeches from the Icelandic Parliament - Althingi

The design objectives for an automatic transcription system are to produ...

Latexify Math: Mathematical Formula Markup Revision to Assist Collaborative Editing in Math Q A Sites

Collaborative editing questions and answers plays an important role in q...

Post-edit Analysis of Collective Biography Generation

Text generation is increasingly common but often requires manual post-ed...

PePe: Personalized Post-editing Model utilizing User-generated Post-edits

Incorporating personal preference is crucial in advanced machine transla...
