Attention-Driven Multi-Modal Fusion: Enhancing Sign Language Recognition and Translation

09/04/2023
by   Zaber Ibn Abdul Hakim, et al.

In this paper, we devise a mechanism for adding multi-modal information to an existing pipeline for continuous sign language recognition and translation. In our procedure, we incorporate optical flow information alongside RGB images to enrich the features with movement-related information. This work studies the feasibility of such modality inclusion using a cross-modal encoder. The plugin we use is very lightweight and does not require including a separate feature extractor for the new modality in an end-to-end manner. We apply the changes to both sign language recognition and translation, improving the results in each case. We evaluate performance on the RWTH-PHOENIX-2014 dataset for sign language recognition and on the RWTH-PHOENIX-2014T dataset for translation. On the recognition task, our approach reduces the WER by 0.9, and on the translation task, it increases most of the BLEU scores by 0.6 on the test set.
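
For readers unfamiliar with the recognition metric, the following is a minimal sketch of how word error rate (WER) is commonly computed: the word-level edit distance between a reference and a hypothesis gloss sequence, divided by the reference length and expressed in percent. This is the standard textbook definition, not necessarily the evaluation script used in the paper; the function name `word_error_rate` and the gloss sequences are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Return WER in percent between two token sequences (e.g. lists of glosses)."""
    if not reference:
        raise ValueError("reference must contain at least one token")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hypothesis[:j].
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]  # distance from reference[:i] to an empty hypothesis
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 100.0 * prev[-1] / len(reference)


# Hypothetical gloss sequences, for illustration only.
ref = "MORGEN REGEN NORD WIND".split()
hyp = "MORGEN NORD STARK WIND".split()
print(word_error_rate(ref, hyp))  # 2 edits over 4 reference tokens -> 50.0
```

Under this definition, a lower WER is better, so a reduction in WER corresponds to fewer word-level errors per reference token.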
