Can You Traducir This? Machine Translation for Code-Switched Input

05/11/2021
by Jitao Xu, et al.

Code-Switching (CSW) is a common phenomenon in multilingual geographic or social contexts, and it raises challenging problems for natural language processing tools. We focus here on Machine Translation (MT) of CSW texts, where we aim to simultaneously disentangle and translate the two mixed languages. Due to the lack of actual translated CSW data, we generate artificial training data from regular parallel texts. Experiments show that this training strategy yields MT systems that surpass multilingual systems on code-switched texts. These results are confirmed in an alternative task aimed at providing contextual translations for an L2 writing assistant.
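The abstract does not spell out how the artificial CSW data is built, so the following is only a minimal sketch of one plausible approach: given a parallel sentence pair and a word alignment, some source tokens are swapped for their aligned target tokens. The function name `make_code_switched`, the `switch_prob` parameter, the one-to-one alignment format, and the toy sentence pair are all assumptions made for illustration, not the authors' procedure.

```python
import random


def make_code_switched(src_tokens, tgt_tokens, alignment, switch_prob=0.3, rng=None):
    """Build a synthetic code-switched sentence from a parallel pair.

    alignment maps a source token index to a target token index
    (hypothetical one-to-one format; real aligners may emit
    many-to-many links). Each aligned source token is replaced by
    its target counterpart with probability switch_prob.
    """
    rng = rng or random.Random()
    mixed = []
    for i, token in enumerate(src_tokens):
        j = alignment.get(i)
        if j is not None and rng.random() < switch_prob:
            mixed.append(tgt_tokens[j])  # switch to the other language
        else:
            mixed.append(token)  # keep the original language
    return mixed


# Toy parallel pair (illustrative only, not taken from the paper's data).
src = "can you translate this ?".split()
tgt = "puedes traducir esto ?".split()
alignment = {2: 1}  # "translate" <-> "traducir"

# switch_prob=1.0 forces every aligned token to be switched.
mixed = make_code_switched(src, tgt, alignment, switch_prob=1.0)
print(" ".join(mixed))
```

Under these assumptions, each mixed sentence could be paired with its original monolingual sentences to form training examples; how the spans to switch are actually selected is left to the full text.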


