Some observations on computer lip-reading: moving from the dream to the reality

10/03/2017
by Helen L Bear, et al.

In the quest for greater computer lip-reading performance, a number of tacit assumptions are built either into the datasets (high resolution, for example) or into the methods (recognition of spoken visual units called visemes, for example). Here we review these and other assumptions and show the surprising result that computer lip-reading is not heavily constrained by video resolution, pose, lighting and other practical factors. However, the working assumption that visemes, the visual equivalent of phonemes, are the best unit for recognition does need further examination. We conclude that visemes, which were defined over a century ago, are unlikely to be optimal for a modern computer lip-reading system.


