Lipreading using Temporal Convolutional Networks

01/23/2020
by   Brais Martinez, et al.
13

Lip-reading has attracted a lot of research attention lately thanks to advances in deep learning. The current state-of-the-art model for recognition of isolated words in-the-wild consists of a residual network and Bidirectional Gated Recurrent Unit (BGRU) layers. In this work, we address the limitations of this model and we propose changes which further improve its performance. Firstly, the BGRU layers are replaced with Temporal Convolutional Networks (TCN). Secondly, we greatly simplify the training procedure, which allows us to train the model in one single stage. Thirdly, we show that the current state-of-the-art methodology produces models that do not generalize well to variations on the sequence length, and we addresses this issue by proposing a variable-length augmentation. We present results on the largest publicly-available datasets for isolated word recognition in English and Mandarin, LRW and LRW1000, respectively. Our proposed model results in an absolute improvement of 1.2 the new state-of-the-art performance.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/03/2022

Training Strategies for Improved Lip-reading

Several training strategies and temporal models have been recently propo...
research
09/29/2020

Lip-reading with Densely Connected Temporal Convolutional Networks

In this work, we present the Densely Connected Temporal Convolutional Ne...
research
11/14/2019

Towards Pose-invariant Lip-Reading

Lip-reading models have been significantly improved recently thanks to p...
research
03/12/2017

Combining Residual Networks with LSTMs for Lipreading

We propose an end-to-end deep learning architecture for word-level visua...
research
08/04/2019

Deep Neural Network for Semantic-based Text Recognition in Images

State-of-the-art text spotting systems typically aim to detect isolated ...
research
08/27/2018

A neural attention model for speech command recognition

This paper introduces a convolutional recurrent network with attention f...
research
11/12/2013

Visualizing and Understanding Convolutional Networks

Large Convolutional Network models have recently demonstrated impressive...

Please sign up or login with your details

Forgot password? Click here to reset