
Natural Language-Assisted Sign Language Recognition

by Ronglai Zuo, et al.
The Hong Kong University of Science and Technology

Sign languages are visual languages that convey information through signers' handshapes, facial expressions, body movements, and so forth. Because these visual ingredients can be combined in only a limited number of ways, sign languages contain a significant number of visually indistinguishable signs (VISigns), which limits the recognition capacity of vision neural networks. To mitigate the problem, we propose the Natural Language-Assisted Sign Language Recognition (NLA-SLR) framework, which exploits the semantic information contained in glosses (sign labels). First, for VISigns with similar semantic meanings, we propose language-aware label smoothing: to ease training, we generate a soft label for each training sign, with smoothing weights computed from the normalized semantic similarities among the glosses. Second, for VISigns with distinct semantic meanings, we present an inter-modality mixup technique that blends vision and gloss features to further maximize the separability of different signs under the supervision of blended labels. We also introduce a novel backbone, the video-keypoint network, which not only models both RGB videos and human body keypoints but also derives knowledge from sign videos of different temporal receptive fields. Empirically, our method achieves state-of-the-art performance on three widely adopted benchmarks: MSASL, WLASL, and NMFs-CSL. Code is available at
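The two language-assisted ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the softmax-style normalization with a temperature, the default values of `epsilon` and `temperature`, and the toy gloss vectors are all assumptions made for illustration. In practice the gloss vectors would presumably come from a pretrained word-embedding model, and the mixing weight would be sampled rather than fixed.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def language_aware_soft_label(target, gloss_vectors, epsilon=0.2, temperature=0.5):
    """Build a soft label over all glosses for one training sign.

    The true class keeps weight 1 - epsilon. The remaining epsilon is spread
    over the other classes in proportion to exp(similarity / temperature),
    so glosses that are semantically close to the target receive more mass.
    (Hypothetical normalization; the paper may define it differently.)
    """
    sims = [cosine(gloss_vectors[target], v) for v in gloss_vectors]
    weights = [
        0.0 if k == target else math.exp(s / temperature)
        for k, s in enumerate(sims)
    ]
    total = sum(weights)  # requires at least two glosses
    label = [epsilon * w / total for w in weights]
    label[target] = 1.0 - epsilon
    return label


def inter_modality_mixup(vision_feat, gloss_feat, vision_label, gloss_label, lam):
    """Blend a vision feature with a gloss feature, and their labels, by lam."""
    mixed_feat = [lam * v + (1.0 - lam) * g for v, g in zip(vision_feat, gloss_feat)]
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(vision_label, gloss_label)]
    return mixed_feat, mixed_label


# Toy gloss embeddings (made up): gloss 1 is close to gloss 0, gloss 2 is not.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
soft = language_aware_soft_label(target=0, gloss_vectors=toy_vectors)

# Blend a vision feature of sign 0 with the gloss feature of sign 2.
mixed_feat, mixed_label = inter_modality_mixup(
    vision_feat=[1.0, 0.0],
    gloss_feat=[0.0, 1.0],
    vision_label=[1.0, 0.0, 0.0],
    gloss_label=[0.0, 0.0, 1.0],
    lam=0.5,
)
```

In this sketch the soft label still sums to one, and the semantically closer gloss receives a larger share of the smoothing mass than the unrelated one. The blended label produced by the mixup splits its mass between the two source signs in the same ratio as the features.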




Two-Stream Network for Sign Language Recognition and Translation

Sign languages are visual languages using manual articulations and non-m...

Neural Sign Language Translation based on Human Keypoint Estimation

We propose a sign language translation system based on human keypoint es...

TSPNet: Hierarchical Feature Learning via Temporal Semantic Pyramid for Sign Language Translation

Sign language translation (SLT) aims to interpret sign video sequences i...

CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning

This work focuses on sign language retrieval-a recently proposed task fo...

Improving Continuous Sign Language Recognition with Consistency Constraints and Signer Removal

Most deep-learning-based continuous sign language recognition (CSLR) mod...

ZS-SLR: Zero-Shot Sign Language Recognition from RGB-D Videos

Sign Language Recognition (SLR) is a challenging research area in comput...

Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation

Sign language is the window for people differently-abled to express thei...