ChrEnTranslate: Cherokee-English Machine Translation Demo with Quality Estimation and Corrective Feedback

07/30/2021
by Shiyue Zhang, et al.

We introduce ChrEnTranslate, an online machine translation demonstration system for translation between English and Cherokee, an endangered language. It supports both statistical and neural translation models and provides quality estimation to inform users of reliability, two user feedback interfaces (one for experts and one for common users), example inputs to collect human translations for monolingual data, word alignment visualization, and relevant terms from the Cherokee-English dictionary. The quantitative evaluation demonstrates that our backbone translation models achieve state-of-the-art translation performance and that our quality estimation correlates well with both BLEU and human judgment. By analyzing 216 pieces of expert feedback, we find that NMT is preferable because it copies less than SMT, and that, in general, current models can translate fragments of the source sentence but make major mistakes. When we add these 216 expert-corrected parallel texts back into the training set and retrain the models, we observe equal or slightly better performance, which indicates the potential of human-in-the-loop learning. Our online demo is at https://chren.cs.unc.edu/ , our code is open-sourced at https://github.com/ZhangShiyue/ChrEnTranslate , and our data is available at https://github.com/ZhangShiyue/ChrEn .
