An Empirical Analysis of Deep Audio-Visual Models for Speech Recognition

12/21/2018
by Devesh Walawalkar, et al.

In this project, we study speech recognition, specifically the prediction of individual words from both video frames and audio. Powered by convolutional neural networks, recent speech recognition and lip-reading models achieve performance comparable to that of humans. We re-implemented the state-of-the-art model and derived variants of it. We then conducted extensive experiments examining the effectiveness of an attention mechanism, the use of a more accurate residual network with pre-trained weights as the backbone, and the sensitivity of our model to audio input with and without noise.

research
03/13/2018

Resource aware design of a deep convolutional-recurrent neural network for speech recognition through audio-visual sensor fusion

Today's Automatic Speech Recognition systems only rely on acoustic signa...
research
09/05/2022

Predict-and-Update Network: Audio-Visual Speech Recognition Inspired by Human Speech Perception

Audio and visual signals complement each other in human speech perceptio...
research
03/10/2018

Speech Recognition: Keyword Spotting Through Image Recognition

The problem of identifying voice commands has always been a challenge du...
research
03/22/2020

High Performance Sequence-to-Sequence Model for Streaming Speech Recognition

Recently sequence-to-sequence models have started to achieve state-of-th...
research
02/26/2023

From Audio to Symbolic Encoding

Automatic music transcription (AMT) aims to convert raw audio to symboli...
research
07/10/2023

SparseVSR: Lightweight and Noise Robust Visual Speech Recognition

Recent advances in deep neural networks have achieved unprecedented succ...
research
09/28/2018

Audio-Visual Speech Recognition With A Hybrid CTC/Attention Architecture

Recent works in speech recognition rely either on connectionist temporal...
