GWLAN: General Word-Level AutocompletioN for Computer-Aided Translation

05/31/2021
by Huayang Li, et al.

Computer-aided translation (CAT), the use of software to assist a human translator in the translation process, has proven useful in enhancing the productivity of human translators. Autocompletion, which suggests translation results according to the text pieces provided by human translators, is a core function of CAT. Previous research in this line has two limitations. First, most work on this topic focuses on sentence-level autocompletion (i.e., generating the whole translation as a sentence based on human input), while word-level autocompletion remains under-explored. Second, almost no public benchmarks are available for the autocompletion task of CAT, which might be among the reasons why research progress in CAT is much slower than in automatic machine translation (MT). In this paper, we propose the task of general word-level autocompletion (GWLAN) from a real-world CAT scenario, and construct the first public benchmark to facilitate research on this topic. In addition, we propose an effective method for GWLAN and compare it with several strong baselines. Experiments demonstrate that our proposed method gives significantly more accurate predictions than the baseline methods on our benchmark datasets.
