On Knowledge Distillation for Direct Speech Translation

12/09/2020
by   Marco Gaido, et al.

Direct speech translation (ST) has proven to be a complex task that requires knowledge transfer from its sub-tasks: automatic speech recognition (ASR) and machine translation (MT). In MT, one of the most promising techniques for such transfer is knowledge distillation. In this paper, we compare different solutions for distilling knowledge in a sequence-to-sequence task like ST. Moreover, we analyze the potential drawbacks of this approach and show how to alleviate them while maintaining the benefits in terms of translation quality.
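The abstract does not spell out the distillation objective, but a common formulation it refers to is word-level knowledge distillation, where the ST student is trained to match the token-level output distribution of an MT teacher in addition to the usual cross-entropy loss. The following PyTorch sketch illustrates that idea under stated assumptions; the function name, the temperature, and the alpha weighting are illustrative choices, not the authors' exact recipe.

```python
import torch
import torch.nn.functional as F

def word_level_kd_loss(student_logits, teacher_logits, target_ids,
                       pad_id, temperature=1.0, alpha=0.8):
    """Illustrative word-level KD objective for a seq2seq ST student.

    student_logits: (batch, tgt_len, vocab) from the ST student
    teacher_logits: (batch, tgt_len, vocab) from the MT teacher (detached)
    target_ids:     (batch, tgt_len) gold target tokens
    alpha:          weight of the distillation term vs. the CE term (assumed)
    """
    t = temperature
    # Token-level KL divergence between teacher and student distributions.
    kd = F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="none",
    ).sum(-1) * (t * t)                      # (batch, tgt_len)

    # Standard cross-entropy against the reference translation.
    ce = F.cross_entropy(
        student_logits.transpose(1, 2),      # (batch, vocab, tgt_len)
        target_ids,
        ignore_index=pad_id,
        reduction="none",
    )                                        # (batch, tgt_len)

    # Mask padding positions and average over real tokens.
    mask = (target_ids != pad_id).float()
    n_tokens = mask.sum().clamp(min=1.0)
    kd = (kd * mask).sum() / n_tokens
    ce = (ce * mask).sum() / n_tokens
    return alpha * kd + (1.0 - alpha) * ce
```

In this sketch the teacher logits would typically be produced by running the MT teacher on the gold transcript with gradients disabled; sequence-level variants instead train the student directly on the teacher's beam-search outputs.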


