Multitask training with unlabeled data for end-to-end sign language fingerspelling recognition

10/09/2017
by Bowen Shi, et al.

We address the problem of automatic American Sign Language fingerspelling recognition from video. Prior work has largely relied on frame-level labels, hand-crafted features, or other constraints, and has been hampered by the scarcity of data for this task. We introduce a model for fingerspelling recognition that addresses these issues. The model consists of an auto-encoder-based feature extractor and an attention-based neural encoder-decoder, which are trained jointly. The model receives a sequence of image frames and outputs the fingerspelled word, without relying on any frame-level training labels or hand-crafted features. In addition, the auto-encoder subcomponent makes it possible to leverage unlabeled data to improve the feature learning. The model achieves 11.6 accuracy improvement respectively in signer-independent and signer-adapted fingerspelling recognition over previous approaches that required frame-level training labels.
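The joint training described above can be illustrated with a minimal sketch of a multitask objective. The abstract does not give the loss functions or how they are weighted, so everything here is an assumption made for illustration: the mean-squared-error reconstruction term, the additive combination, the `recon_weight` parameter, and the function names are all hypothetical. The point is only that a labeled sequence contributes both a recognition term and a reconstruction term, while an unlabeled sequence can still contribute the reconstruction term.

```python
from typing import Optional, Sequence


def reconstruction_loss(frame: Sequence[float], recon: Sequence[float]) -> float:
    """Mean squared error between one frame and its auto-encoder output.

    MSE is an assumed choice; the abstract does not name the reconstruction loss.
    """
    return sum((x - y) ** 2 for x, y in zip(frame, recon)) / len(frame)


def multitask_loss(
    frames: Sequence[Sequence[float]],
    recons: Sequence[Sequence[float]],
    letter_nll: Optional[float] = None,
    recon_weight: float = 1.0,  # hypothetical weighting hyperparameter
) -> float:
    """Combine recognition and reconstruction terms for one frame sequence.

    letter_nll is the decoder's negative log-likelihood of the letter
    sequence; pass None for an unlabeled sequence, in which case only the
    reconstruction term is used.
    """
    # Average the per-frame reconstruction error over the sequence.
    recon = sum(
        reconstruction_loss(f, r) for f, r in zip(frames, recons)
    ) / len(frames)
    if letter_nll is None:
        return recon_weight * recon
    return letter_nll + recon_weight * recon


# Toy sequence of two flattened "frames" and their reconstructions.
frames = [[0.0, 1.0], [1.0, 1.0]]
recons = [[0.0, 0.0], [1.0, 1.0]]

labeled = multitask_loss(frames, recons, letter_nll=1.5)
unlabeled = multitask_loss(frames, recons, letter_nll=None, recon_weight=2.0)
```

In a real system both terms would be computed by neural networks and minimized by gradient descent; the sketch only shows how the two data sources could enter a single objective.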

