Temporal superimposed crossover module for effective continuous sign language

11/07/2022
by Qidan Zhu, et al.

The ultimate goal of continuous sign language recognition (CSLR) is to facilitate communication between deaf and hearing people, which requires the model to run in real time and be easy to deploy. Previous CSLR research, however, has paid little attention to real-time performance and deployability. To improve both, this paper proposes a zero-parameter, zero-computation temporal superposition crossover module (TSCM) and combines it with 2D convolution to form a "TSCM+2D convolution" hybrid convolution. This gives 2D convolution strong spatial-temporal modelling capability with no increase in parameters and a lower deployment cost than other spatial-temporal convolutions. The overall TSCM-based CSLR model is built on the improved ResBlockT network proposed in this paper: the hybrid convolution is applied to the ResBlock of ResNet to form the new ResBlockT, and random gradient stop and multi-level CTC loss are introduced to train the model. This lowers the final recognition word error rate (WER) while reducing training memory usage, and extends ResNet from image classification to video recognition. In addition, this study is the first in CSLR to use only 2D convolution to extract spatial-temporal features from sign language video for end-to-end recognition. Experiments on two large-scale continuous sign language datasets demonstrate the effectiveness of the proposed method, which achieves highly competitive results.
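
The abstract does not say how TSCM mixes information between frames, so the sketch below does not reproduce it. It only illustrates the general idea of a parameter-free temporal operation placed in front of a per-frame 2D convolution, using a simple channel shift along the time axis. The function name `temporal_channel_shift`, the `fold_div` knob, and the `(T, C, H, W)` layout are all assumptions made for illustration.

```python
import numpy as np


def temporal_channel_shift(x, fold_div=4):
    """Illustrative parameter-free temporal mixing (NOT the paper's TSCM).

    x: array of shape (T, C, H, W), the per-frame feature maps of one clip.
    fold_div: hypothetical knob; 1/fold_div of the channels is shifted in
              each temporal direction, the rest are left in place.
    """
    t, c, h, w = x.shape
    fold = c // fold_div
    out = np.zeros_like(x)
    # First channel group: frame i receives the features of frame i + 1.
    out[:-1, :fold] = x[1:, :fold]
    # Second channel group: frame i receives the features of frame i - 1.
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]
    # Remaining channels are copied unchanged.
    out[:, 2 * fold:] = x[:, 2 * fold:]
    return out


# A toy clip: 4 frames, 8 channels, 2x2 spatial resolution.
clip = np.arange(4 * 8 * 2 * 2, dtype=np.float32).reshape(4, 8, 2, 2)
shifted = temporal_channel_shift(clip)
```

The shift has no learnable weights and performs no multiply-adds, only memory movement. A 2D convolution applied afterwards to each frame independently would then see channels taken from neighbouring frames, which is one way a purely 2D network can pick up temporal context.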

research
04/08/2022

Multi-scale temporal network for continuous sign language recognition

Continuous Sign Language Recognition (CSLR) is a challenging research ta...
research
04/19/2022

Multi-View Spatial-Temporal Network for Continuous Sign Language Recognition

Sign language is a beautiful visual language and is also the primary lan...
research
07/03/2022

Continuous Sign Language Recognition via Temporal Super-Resolution Network

Aiming at the problem that the spatial-temporal hierarchical continuous ...
research
02/08/2020

Spatial-Temporal Multi-Cue Network for Continuous Sign Language Recognition

Despite the recent success of deep learning in continuous sign language ...
research
01/31/2019

Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition

The recognition of sign language is a challenging task with an important...
research
08/14/2023

Orthogonal Temporal Interpolation for Zero-Shot Video Recognition

Zero-shot video recognition (ZSVR) is a task that aims to recognize vide...
research
07/27/2021

Multi-Scale Local-Temporal Similarity Fusion for Continuous Sign Language Recognition

Continuous sign language recognition (cSLR) is a public significant task...
