Towards Fully Automated Manga Translation

12/28/2020
by Ryota Hinami, et al.

We tackle the problem of machine translation of manga, Japanese comics. Manga translation involves two important problems in machine translation: context-aware and multimodal translation. Since text and images are mixed in an unstructured fashion in manga, obtaining context from the image is essential for manga translation. However, how to extract context from an image and integrate it into MT models remains an open problem. In addition, corpora and benchmarks for training and evaluating such models are currently unavailable. In this paper, we make the following four contributions that establish the foundation of manga translation research. First, we propose a multimodal context-aware translation framework. We are the first to incorporate context information obtained from the manga image. It enables us to translate texts in speech bubbles that cannot be translated without using context information (e.g., texts in other speech bubbles or the gender of speakers). Second, for training the model, we propose an approach to automatic corpus construction from pairs of original manga and their translations, by which a large parallel corpus can be constructed without any manual labeling. Third, we created a new benchmark to evaluate manga translation. Finally, on top of our proposed methods, we devised the first comprehensive system for fully automated manga translation.


