Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation

04/13/2021
by Hirofumi Inaguma, et al.

A conventional approach to improving the performance of end-to-end speech translation (E2E-ST) models is to leverage the source transcription via pre-training and joint training with automatic speech recognition (ASR) and neural machine translation (NMT) tasks. However, since the input modalities are different, it is difficult to leverage source language text successfully. In this work, we focus on sequence-level knowledge distillation (SeqKD) from external text-based NMT models. To leverage the full potential of the source language information, we propose backward SeqKD, SeqKD from a target-to-source backward NMT model. To this end, we train a bilingual E2E-ST model to predict paraphrased transcriptions as an auxiliary task with a single decoder. The paraphrases are generated from the translations in bitext via back-translation. We further propose bidirectional SeqKD in which SeqKD from both forward and backward NMT models is combined. Experimental evaluations on both autoregressive and non-autoregressive models show that SeqKD in each direction consistently improves the translation performance, and the effectiveness is complementary regardless of the model capacity.
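
The data construction implied by bidirectional SeqKD can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the authors' implementation: the `Triple` and `Example` containers, the `build_bidirectional_seqkd` function, the `<2src>`/`<2tgt>` language tags, and the toy translators are all hypothetical names introduced here, and a real setup would plug in trained forward and backward NMT models.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Triple:
    """One speech translation sample (hypothetical container)."""
    speech_id: str
    transcript: str   # source-language reference text
    translation: str  # target-language reference text


@dataclass
class Example:
    """One decoder training target for a given utterance."""
    speech_id: str
    lang_tag: str     # assumed tag telling the single decoder which language to emit
    target_text: str


def build_bidirectional_seqkd(
    triples: List[Triple],
    forward_nmt: Callable[[str], str],   # source -> target text model
    backward_nmt: Callable[[str], str],  # target -> source text model
    src_tag: str = "<2src>",
    tgt_tag: str = "<2tgt>",
) -> List[Example]:
    examples: List[Example] = []
    for t in triples:
        # Forward SeqKD: the translation target is produced by the
        # forward NMT model from the source transcript.
        examples.append(Example(t.speech_id, tgt_tag, forward_nmt(t.transcript)))
        # Backward SeqKD: a paraphrased transcription is produced by
        # back-translating the reference translation.
        examples.append(Example(t.speech_id, src_tag, backward_nmt(t.translation)))
    return examples


# Toy stand-ins for NMT models (assumptions for illustration only).
def toy_forward(text: str) -> str:
    return "DE(" + text + ")"


def toy_backward(text: str) -> str:
    return "EN(" + text + ")"


if __name__ == "__main__":
    data = [Triple("utt1", "hello", "hallo")]
    for ex in build_bidirectional_seqkd(data, toy_forward, toy_backward):
        print(ex)
```

Each utterance yields two targets for the same speech input, one per direction, which is one way a single decoder could be trained on both the translation task and the auxiliary paraphrase task.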
