Speech Recognition Front End Without Information Loss

12/24/2013
by Matthew Ager, et al.

Speech representation and modelling in high-dimensional spaces of acoustic waveforms, or a linear transformation thereof, is investigated with the aim of improving the robustness of automatic speech recognition to additive noise. The motivation behind this approach is twofold: (i) the information in acoustic waveforms that is usually removed in the process of extracting low-dimensional features might aid robust recognition by virtue of structured redundancy analogous to channel coding, and (ii) linear feature domains allow for exact noise adaptation, as opposed to representations that involve non-linear processing, which makes noise adaptation challenging. Thus, we develop a generative framework for phoneme modelling in high-dimensional linear feature domains, and use it in phoneme classification and recognition tasks. Results show that classification and recognition in this framework perform better than analogous PLP and MFCC classifiers below 18 dB SNR. A combination of the high-dimensional and MFCC features at the likelihood level performs uniformly better than either of the individual representations across all noise levels.
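
The claim that linear feature domains allow exact noise adaptation can be illustrated with a minimal sketch. It assumes, purely for illustration, that a phoneme class is modelled by a single Gaussian over waveform samples and that the additive noise is white, zero-mean, and independent of the speech; the abstract does not specify the actual models used, and the function names below are hypothetical. Under these assumptions, if x ~ N(mean, cov) and n ~ N(0, noise_var * I), then any linear feature y = A(x + n) is again Gaussian with mean A·mean and covariance A(cov + noise_var * I)Aᵀ, so the clean model can be adapted in closed form.

```python
import numpy as np


def adapt_gaussian_to_noise(mean, cov, noise_var, transform=None):
    """Adapt a clean-speech Gaussian to additive white noise (illustrative).

    Assumes x ~ N(mean, cov) and independent noise n ~ N(0, noise_var * I).
    Returns the exact mean and covariance of y = A (x + n), where A is an
    optional linear feature transform (identity if omitted).
    """
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    dim = mean.shape[0]
    A = np.eye(dim) if transform is None else np.asarray(transform, dtype=float)
    noisy_mean = A @ mean
    noisy_cov = A @ (cov + noise_var * np.eye(dim)) @ A.T
    return noisy_mean, noisy_cov


def gaussian_log_likelihood(y, mean, cov):
    """Log density of a multivariate Gaussian evaluated at y."""
    y = np.asarray(y, dtype=float)
    diff = y - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (y.shape[0] * np.log(2.0 * np.pi) + logdet + quad)


if __name__ == "__main__":
    # Two hypothetical phoneme classes in a toy 2-dimensional waveform space.
    classes = {
        "a": (np.array([1.0, 0.0]), np.array([[1.0, 0.3], [0.3, 0.5]])),
        "b": (np.array([-1.0, 0.0]), np.array([[0.5, 0.0], [0.0, 1.0]])),
    }
    noise_var = 0.8  # assumed known noise variance
    observation = np.array([0.7, -0.2])
    scores = {
        label: gaussian_log_likelihood(
            observation, *adapt_gaussian_to_noise(m, c, noise_var)
        )
        for label, (m, c) in classes.items()
    }
    print(max(scores, key=scores.get))
```

A non-linear front end (for example, one that takes logarithms of spectral energies) has no such closed-form update, which is the contrast the abstract draws.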


