Acoustic Landmarks Contain More Information About the Phone String than Other Frames

10/27/2017
by Di He, et al.

Most mainstream Automatic Speech Recognition (ASR) systems treat all feature frames as equally important. Acoustic landmark theory rests on the opposite idea: some frames are more important than others. The theory exploits the quantal, nonlinear articulatory-acoustic relationships observed in human speech perception experiments, and it provides theoretical support for extracting acoustic features in the vicinity of landmark regions, where the spectrum of the speech signal changes abruptly. In this work, we conduct experiments on the TIMIT corpus with both GMM-based and DNN-based ASR systems and find that frames containing landmarks are more informative than other frames. We find that altering the level of emphasis on landmarks, by re-weighting the acoustic likelihood of frames accordingly, tends to reduce the phone error rate (PER). Furthermore, by leveraging landmarks as a heuristic, one of our hybrid DNN frame-dropping strategies maintained a PER within 0.44% of optimal when scoring less than half (45.8% to be precise) of the frames. This hybrid strategy outperforms other non-heuristic-based methods and demonstrates the potential of landmarks for reducing computation.
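The two ideas in the abstract, emphasizing landmark frames by re-weighting their acoustic scores and scoring only a subset of frames, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the multiplicative weight on log-likelihoods, and the "keep landmarks plus every k-th frame" rule are all assumptions chosen for illustration, and the toy scores are made up.

```python
from typing import List, Set

def reweight_loglikes(loglikes: List[List[float]],
                      landmark_frames: Set[int],
                      landmark_weight: float) -> List[List[float]]:
    """Scale the per-state log-likelihoods of landmark frames.

    A weight above 1 makes landmark frames count more in the summed
    path score; non-landmark frames are copied unchanged.
    (Hypothetical scheme, assumed for illustration.)
    """
    return [
        [landmark_weight * ll for ll in frame] if t in landmark_frames
        else list(frame)
        for t, frame in enumerate(loglikes)
    ]

def select_frames(num_frames: int,
                  landmark_frames: Set[int],
                  keep_every: int) -> List[int]:
    """Return indices of frames to score: every landmark frame plus
    every keep_every-th frame. (Hypothetical dropping rule.)"""
    return [t for t in range(num_frames)
            if t in landmark_frames or t % keep_every == 0]

# Toy data: 4 frames, 2 states per frame; frame 2 is assumed to hold a landmark.
loglikes = [[-1.0, -2.0], [-0.5, -3.0], [-2.0, -0.5], [-1.5, -1.5]]
landmarks = {2}

weighted = reweight_loglikes(loglikes, landmarks, landmark_weight=1.5)
kept = select_frames(len(loglikes), landmarks, keep_every=3)
scored_fraction = len(kept) / len(loglikes)
```

In this toy setup only frame 2 has its scores scaled, and frames 0, 2, and 3 are scored while frame 1 is dropped. A real system would take landmark positions from a detector or from phone-boundary annotations and would need a rule for filling in the scores of dropped frames.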


