AppTek's Submission to the IWSLT 2022 Isometric Spoken Language Translation Task

05/12/2022
by Patrick Wilken, et al.

To participate in the Isometric Spoken Language Translation Task of the IWSLT 2022 evaluation, constrained condition, AppTek developed neural Transformer-based systems for English-to-German with various mechanisms of length control, ranging from source-side and target-side pseudo-tokens to an encoding of the remaining length in characters that replaces positional encoding. We further increased translation length compliance by sentence-level selection of length-compliant hypotheses from different system variants, as well as by rescoring N-best candidates from a single system. Length-compliant back-translated and forward-translated synthetic data, as well as other parallel data variants derived from the original MuST-C training corpus, were important for a good trade-off between quality and desired length. Our experimental results show that length compliance levels above 90% can be reached while minimizing losses in MT quality as measured in BERT and BLEU scores.
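The sentence-level selection of length-compliant hypotheses mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a translation counts as length-compliant when its character count is within 10% of the source character count; that threshold comes from the task definition as we understand it and is not stated in the text above. The function names, the (hypothesis, score) input format, and the fallback rule are hypothetical choices made for illustration.

```python
from typing import List, Tuple


def is_length_compliant(source: str, hypothesis: str, tolerance: float = 0.10) -> bool:
    """Return True if the hypothesis length in characters is within
    +/- tolerance of the source length (assumed compliance criterion)."""
    return abs(len(hypothesis) - len(source)) <= tolerance * len(source)


def select_hypothesis(
    source: str,
    candidates: List[Tuple[str, float]],
    tolerance: float = 0.10,
) -> str:
    """Pick one translation from (hypothesis, score) pairs, higher score is better.

    The candidates may be an N-best list from one system or the outputs
    of several system variants. Scores are hypothetical model scores.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")

    compliant = [(hyp, score) for hyp, score in candidates
                 if is_length_compliant(source, hyp, tolerance)]
    if compliant:
        # Among compliant candidates, keep the one the model prefers.
        return max(compliant, key=lambda pair: pair[1])[0]

    # Assumed fallback: no candidate is compliant, so take the one whose
    # length is closest to the source length.
    return min(candidates, key=lambda pair: abs(len(pair[0]) - len(source)))[0]
```

With this rule, a lower-scoring hypothesis is preferred over the top-scoring one whenever only the former satisfies the length constraint, which is the trade-off between quality and length compliance that the abstract refers to.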


