Designing the Business Conversation Corpus

by Matīss Rikters, et al.

Although machine translation of written text has advanced considerably in the past several years, thanks to the increasing availability of parallel corpora and corpus-based training technologies, automatic translation of spoken text and dialogues remains challenging even for modern systems. In this paper, we aim to improve the machine translation quality of conversational texts by introducing a newly constructed Japanese-English business conversation parallel corpus. We provide a detailed analysis of the corpus, along with examples that are challenging for automatic translation. We also experiment with adding the corpus to a machine translation training scenario and show how the resulting system benefits from its use.


