Unsupervised Clustering of Commercial Domains for Adaptive Machine Translation

12/14/2016
by Mauro Cettolo, et al.

In this paper, we report on domain clustering in the context of an adaptive MT architecture. A standard bottom-up hierarchical clustering algorithm has been instantiated with five different distances, which have been compared on an MT benchmark built on 40 commercial domains in terms of dendrograms and of intrinsic and extrinsic evaluations. The main outcome is that the most expensive distance is also the only one that allows the MT engine to guarantee good performance even with few but highly populated clusters of domains.
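To make the setup concrete, the following is a minimal sketch of bottom-up (agglomerative) clustering with a pluggable distance. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the five distances, so `euclidean` and `manhattan` are stand-ins, average linkage is an assumed merge criterion, and the domain names and feature vectors in `domains` are hypothetical toy data.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def euclidean(a: Vector, b: Vector) -> float:
    """Stand-in distance (assumption; not one of the paper's five)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a: Vector, b: Vector) -> float:
    """Second stand-in distance, to show that the metric is swappable."""
    return sum(abs(x - y) for x, y in zip(a, b))


def agglomerate(
    domains: Dict[str, Vector], distance: Distance, n_clusters: int
) -> List[List[str]]:
    """Merge the two closest clusters until n_clusters remain."""
    # Start with one singleton cluster per domain.
    clusters = [[name] for name in sorted(domains)]

    def linkage(c1: List[str], c2: List[str]) -> float:
        # Average linkage (assumption): mean pairwise distance.
        total = sum(distance(domains[a], domains[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage value.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical toy data: four domains described by 2-d feature vectors.
domains = {
    "legal": (0.0, 0.1),
    "finance": (0.1, 0.0),
    "gaming": (5.0, 5.1),
    "travel": (5.1, 5.0),
}

print(agglomerate(domains, euclidean, 2))
```

Passing `manhattan` instead of `euclidean` changes only the metric, which mirrors the idea of instantiating one clustering algorithm with several distances. Recording the order of merges, rather than stopping at `n_clusters`, would yield the dendrogram.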


