Improving End-to-End Text Image Translation From the Auxiliary Text Translation Task

10/08/2022
by   Cong Ma, et al.

End-to-end text image translation (TIT), which aims to translate source-language text embedded in images into the target language, has attracted considerable attention in recent research. However, data sparsity limits the performance of end-to-end TIT models. Multi-task learning can alleviate this problem by exploiting knowledge from complementary, related tasks. In this paper, we propose a text translation enhanced text image translation method, which trains the end-to-end model with text translation as an auxiliary task. Through shared model parameters and multi-task training, the model can take advantage of easily available large-scale parallel text corpora. Extensive experimental results show that the proposed method outperforms existing end-to-end methods, and that joint multi-task learning with both text translation and recognition tasks achieves still better results, indicating that the translation and recognition auxiliary tasks are complementary.
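As a rough illustration of the multi-task idea described in the abstract, the sketch below combines a main TIT loss with auxiliary text translation and (optionally) recognition losses into one training objective. The weighted-sum form, the names `TaskWeights` and `joint_loss`, and the default weights are assumptions made for illustration only; the abstract does not state the actual objective or any weight values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskWeights:
    """Hypothetical per-task weights; the abstract gives no values."""
    tit: float = 1.0  # main task: text image translation
    mt: float = 1.0   # auxiliary task: text translation
    ocr: float = 1.0  # auxiliary task: text recognition


def joint_loss(
    tit_loss: float,
    mt_loss: float,
    ocr_loss: Optional[float] = None,
    weights: TaskWeights = TaskWeights(),
) -> float:
    """Return a weighted sum of the per-task losses (an assumed objective).

    The recognition term is added only when ocr_loss is provided, which
    mirrors the two settings in the abstract: translation as the sole
    auxiliary task, or translation and recognition together.
    """
    total = weights.tit * tit_loss + weights.mt * mt_loss
    if ocr_loss is not None:
        total += weights.ocr * ocr_loss
    return total


# Translation-only auxiliary setting.
loss_mt_only = joint_loss(tit_loss=2.0, mt_loss=1.0)

# Translation plus recognition, with illustrative (made-up) weights.
loss_all = joint_loss(2.0, 1.0, ocr_loss=0.5, weights=TaskWeights(1.0, 0.5, 2.0))
```

In a real system each loss would come from a forward pass through components that share parameters across tasks, so that gradients from the text-only parallel corpus also update the modules used for image translation.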


research
05/09/2023

Multi-Teacher Knowledge Distillation For Text Image Machine Translation

Text image machine translation (TIMT) has been widely used in various re...
research
12/16/2019

Synchronous Speech Recognition and Speech-to-Text Translation with Interactive Decoding

Speech-to-text translation (ST), which translates source language speech...
research
11/07/2018

Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks

The training of many existing end-to-end steering angle prediction model...
research
04/15/2019

Attention-Passing Models for Robust and Data-Efficient End-to-End Speech Translation

Speech translation has traditionally been approached through cascaded mo...
research
05/08/2022

Scheduled Multi-task Learning for Neural Chat Translation

Neural Chat Translation (NCT) aims to translate conversational text into...
research
03/29/2019

Attention-Augmented End-to-End Multi-Task Learning for Emotion Prediction from Speech

Despite the increasing research interest in end-to-end learning systems ...
research
08/29/2019

DeepDistance: A Multi-task Deep Regression Model for Cell Detection in Inverted Microscopy Images

This paper presents a new deep regression model, which we call DeepDista...
