Automated Speaker Independent Visual Speech Recognition: A Comprehensive Survey

06/14/2023
by   Praneeth Nemani, et al.

Speaker-independent visual speech recognition (VSR) is a complex task that involves identifying spoken words or phrases from video recordings of a speaker's facial movements. Over the years, a considerable amount of research in VSR has explored different algorithms and datasets for evaluating system performance. These efforts have produced significant progress in developing effective VSR models and created new opportunities for further research. This survey provides a detailed examination of the progression of VSR over the past three decades, with a particular emphasis on the transition from speaker-dependent to speaker-independent systems. We also provide a comprehensive overview of the datasets used in VSR research and the preprocessing techniques employed to achieve speaker independence. The survey covers works published from 1990 to 2023, analyzing each work and comparing them on various parameters. It outlines the development of VSR systems over time and highlights the need to develop end-to-end pipelines for speaker-independent VSR. A pictorial representation offers a clear and concise overview of the techniques used in speaker-independent VSR, aiding the comprehension and analysis of the various methodologies. The survey also highlights the strengths and limitations of each technique and provides insights into developing novel approaches for analyzing visual speech cues. Overall, this review describes the current state of the art in speaker-independent VSR and highlights potential areas for future research.


research
01/24/2021

A Review of Speaker Diarization: Recent Advances with Deep Learning

Speaker diarization is a task to label audio or video recordings with cl...
research
10/24/2018

The speaker-independent lipreading play-off; a survey of lipreading machines

Lipreading is a difficult gesture classification task. One problem in co...
research
07/13/2020

SoK: The Faults in our ASRs: An Overview of Attacks against Automatic Speech Recognition and Speaker Identification Systems

Speech and speaker recognition systems are employed in a variety of appl...
research
07/13/2020

The Faults in our ASRs: An Overview of Attacks against Automatic Speech Recognition and Speaker Identification Systems

Speech and speaker recognition systems are employed in a variety of appl...
research
08/30/2023

From Pixels to Portraits: A Comprehensive Survey of Talking Head Generation Techniques and Applications

Recent advancements in deep learning and computer vision have led to a s...
research
02/23/2022

State-of-the-art in speaker recognition

Recent advances in speech technologies have produced new tools that can ...
research
10/03/2017

Visual speech recognition: aligning terminologies for better understanding

We are at an exciting time for machine lipreading. Traditional research ...
