
Fine-tuning on Clean Data for End-to-End Speech Translation: FBK @ IWSLT 2018

by Mattia Antonino Di Gangi, et al.
Fondazione Bruno Kessler
Università di Trento

This paper describes FBK's submission to the end-to-end English-German speech translation task at IWSLT 2018. Our system relies on a state-of-the-art model based on LSTMs and CNNs, where the CNNs reduce the temporal dimension of the audio input, which is in general much longer than machine translation input. The model was trained only on the audio-to-text parallel data released for the task and fine-tuned on cleaned subsets of the original training corpus. Adding weight normalization and label smoothing improved the baseline system by 1.0 BLEU point on our validation set. The final submission also featured checkpoint averaging within a training run and ensemble decoding of models trained during multiple runs. On test data, our best single model obtained a BLEU score of 9.7, while the ensemble reached 10.24.
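Checkpoint averaging, one of the techniques mentioned in the abstract, combines several checkpoints saved during a single training run by averaging their parameters elementwise. A minimal sketch of the idea, assuming checkpoints are stored as name-to-array dictionaries (the helper `average_checkpoints` and the toy data are illustrative, not the authors' actual code):

```python
import numpy as np

def average_checkpoints(checkpoints):
    """Average parameter tensors across checkpoints from one training run.

    `checkpoints` is a list of dicts mapping parameter names to arrays;
    all checkpoints are assumed to share the same parameter names and shapes.
    """
    averaged = {}
    for name in checkpoints[0]:
        # Elementwise mean over the checkpoint axis for this parameter.
        averaged[name] = np.mean([ckpt[name] for ckpt in checkpoints], axis=0)
    return averaged

# Toy example: three "checkpoints" with a single two-element parameter.
ckpts = [{"w": np.array([1.0, 2.0])},
         {"w": np.array([3.0, 4.0])},
         {"w": np.array([5.0, 6.0])}]
avg = average_checkpoints(ckpts)
print(avg["w"])  # [3. 4.]
```

In practice the same averaging is applied to the model's saved weight files near the end of training, which tends to smooth out noise from individual optimization steps.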


