Cross-modality Data Augmentation for End-to-End Sign Language Translation

05/18/2023
by Jinhui Ye, et al.

End-to-end sign language translation (SLT) aims to convert sign language videos directly into spoken language texts, without intermediate representations. The task is challenging because of the modality gap between sign videos and texts and the scarcity of labeled data. To tackle these challenges, we propose a novel Cross-modality Data Augmentation (XmDA) framework that transfers powerful gloss-to-text translation capabilities to end-to-end sign language translation (i.e., video-to-text) by exploiting pseudo gloss-text pairs from a sign gloss translation model. Specifically, XmDA consists of two key components: cross-modality mix-up and cross-modality knowledge distillation. The former explicitly encourages alignment between sign video features and gloss embeddings to bridge the modality gap. The latter uses the generation knowledge of gloss-to-text teacher models to guide spoken language text generation. Experimental results on two widely used SLT datasets, PHOENIX-2014T and CSL-Daily, demonstrate that the proposed XmDA framework significantly and consistently outperforms the baseline models. Extensive analyses confirm our claim that XmDA enhances spoken language text generation by reducing the representation distance between videos and texts, and by improving the processing of low-frequency words and long sentences.
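The two components named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how video frames are aligned to glosses, at what granularity features are mixed, or how teacher outputs are used in training. The sketch assumes that each gloss comes with a known `(start, end)` frame span (end exclusive, spans sorted and non-overlapping), that mixing means swapping a whole span for its gloss embedding with probability `mix_ratio`, and that distillation is done at the sequence level by pairing a video with teacher-generated texts. All function and parameter names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Vector = List[float]


def cross_modality_mixup(
    video_feats: Sequence[Vector],
    gloss_embs: Sequence[Vector],
    spans: Sequence[Tuple[int, int]],
    mix_ratio: float,
    rng: random.Random,
) -> List[Vector]:
    """Swap gloss-aligned video segments for gloss embeddings.

    Assumption (not from the abstract): spans[i] = (start, end) gives the
    frames aligned to gloss i, end exclusive, sorted and non-overlapping.
    """
    if len(gloss_embs) != len(spans):
        raise ValueError("need exactly one (start, end) span per gloss")
    mixed: List[Vector] = []
    cursor = 0
    for emb, (start, end) in zip(gloss_embs, spans):
        # Frames not covered by any span are always kept as video features.
        mixed.extend(list(f) for f in video_feats[cursor:start])
        if rng.random() < mix_ratio:
            # Replace the whole segment with a single gloss embedding.
            mixed.append(list(emb))
        else:
            mixed.extend(list(f) for f in video_feats[start:end])
        cursor = end
    mixed.extend(list(f) for f in video_feats[cursor:])
    return mixed


def augment_with_teachers(
    video_id: str, reference: str, teacher_outputs: Sequence[str]
) -> List[Tuple[str, str]]:
    """Sequence-level distillation sketch: pair one video with the reference
    text plus every distinct teacher-generated text, preserving order."""
    targets: List[str] = []
    for text in [reference, *teacher_outputs]:
        if text not in targets:
            targets.append(text)
    return [(video_id, text) for text in targets]
```

With `mix_ratio=0.0` the function returns the original video features unchanged, and with `mix_ratio=1.0` every aligned segment collapses to its gloss embedding; intermediate values give mixed-modality sequences that an encoder can be trained on alongside the unmixed ones.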

research
10/13/2022

Scaling Back-Translation with Domain Text Generation for Sign Language Gloss Translation

Sign language gloss translation aims to translate the sign glosses into ...
research
05/02/2023

SLTUNET: A Simple Unified Model for Sign Language Translation

Despite recent successes with neural models for sign language translatio...
research
10/11/2020

Boosting Continuous Sign Language Recognition via Cross Modality Augmentation

Continuous sign language recognition (SLR) deals with unaligned video-te...
research
05/26/2021

Improving Sign Language Translation with Monolingual Data by Sign Back-Translation

Despite existing pioneering works on sign language translation (SLT), th...
research
08/12/2022

Non-Autoregressive Sign Language Production via Knowledge Distillation

Sign Language Production (SLP) aims to translate expressions in spoken l...
research
12/08/2021

SimulSLT: End-to-End Simultaneous Sign Language Translation

Sign language translation as a kind of technology with profound social s...
research
06/24/2021

Towards Automatic Speech to Sign Language Generation

We aim to solve the highly challenging task of generating continuous sig...
