Filter-based Discriminative Autoencoders for Children Speech Recognition

04/01/2022
by Chiang-Lin Tai, et al.

Children's speech recognition is indispensable but challenging due to the diversity of children's speech. In this paper, we propose a filter-based discriminative autoencoder for acoustic modeling. To filter out the influence of speaker type and pitch variation, auxiliary speaker and pitch information is fed into the encoder together with the acoustic features to generate phonetic embeddings. In the training phase, the decoder uses the auxiliary information and the phonetic embedding extracted by the encoder to reconstruct the input acoustic features. The autoencoder is trained by simultaneously minimizing the ASR loss and the feature reconstruction error. This framework makes the phonetic embedding purer, resulting in more accurate senone (triphone-state) scores. Evaluated on the test set of the CMU Kids corpus, our system achieves a 7.8% relative word error rate reduction compared to the baseline system. In the domain adaptation experiment, our system also outperforms the baseline system on the British-accent PF-STAR task.
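The joint training objective described in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the single-layer encoder, classifier, and decoder, the toy dimensions, the frame-level cross-entropy used as a stand-in for the ASR loss, and the `recon_weight` parameter are all hypothetical choices made here for clarity.

```python
import math
import random


def linear(weights, vec):
    """Multiply a (rows x cols) weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def init_matrix(rows, cols, rng):
    """Small random weights; a placeholder for real network parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def cross_entropy(logits, target):
    """Softmax cross-entropy for one frame (assumed stand-in for the ASR loss)."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target]


def mean_squared_error(pred, ref):
    """Feature reconstruction error for one frame."""
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)


def joint_loss(params, acoustic, auxiliary, senone, recon_weight=0.5):
    # Encoder: acoustic features plus auxiliary (speaker/pitch) information
    # are mapped to a phonetic embedding.
    embedding = [math.tanh(x) for x in linear(params["enc"], acoustic + auxiliary)]
    # Classifier: phonetic embedding -> senone scores.
    logits = linear(params["cls"], embedding)
    # Decoder (training only): phonetic embedding plus auxiliary information
    # reconstruct the input acoustic features.
    reconstruction = linear(params["dec"], embedding + auxiliary)
    asr = cross_entropy(logits, senone)
    recon = mean_squared_error(reconstruction, acoustic)
    # Both terms are minimized together; recon_weight is a hypothetical knob.
    return asr + recon_weight * recon, asr, recon


# Toy dimensions, chosen only for illustration.
ACOUSTIC_DIM, AUX_DIM, EMBED_DIM, NUM_SENONES = 4, 2, 3, 5
rng = random.Random(0)
params = {
    "enc": init_matrix(EMBED_DIM, ACOUSTIC_DIM + AUX_DIM, rng),
    "cls": init_matrix(NUM_SENONES, EMBED_DIM, rng),
    "dec": init_matrix(ACOUSTIC_DIM, EMBED_DIM + AUX_DIM, rng),
}
acoustic = [0.2, -0.1, 0.4, 0.0]   # one frame of acoustic features
auxiliary = [1.0, 0.3]             # speaker/pitch side information
total, asr, recon = joint_loss(params, acoustic, auxiliary, senone=2)
print(f"total={total:.4f} asr={asr:.4f} recon={recon:.4f}")
```

Because the decoder receives the auxiliary information directly, the embedding has less incentive to encode speaker and pitch cues itself, which is the intuition behind the "purer" phonetic embedding. At recognition time only the encoder and classifier paths would be needed in this sketch.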

research
10/30/2021

Speaker conditioning of acoustic models using affine transformation for multi-speaker speech recognition

This study addresses the problem of single-channel Automatic Speech Reco...
research
06/16/2022

Nonwords Pronunciation Classification in Language Development Tests for Preschool Children

This work aims to automatically evaluate whether the language developmen...
research
09/23/2021

Scenario Aware Speech Recognition: Advancements for Apollo Fearless Steps CHiME-4 Corpora

In this study, we propose to investigate triplet loss for the purpose of...
research
01/24/2022

Variational Auto-Encoder Based Variability Encoding for Dysarthric Speech Recognition

Dysarthric speech recognition is a challenging task due to acoustic vari...
research
11/25/2020

SAR-Net: A End-to-End Deep Speech Accent Recognition Network

This paper proposes a end-to-end deep network to recognize kinds of acce...
research
03/25/2022

Chain-based Discriminative Autoencoders for Speech Recognition

In our previous work, we proposed a discriminative autoencoder (DcAE) fo...
research
05/06/2022

A Conformer-based Waveform-domain Neural Acoustic Echo Canceller Optimized for ASR Accuracy

Acoustic Echo Cancellation (AEC) is essential for accurate recognition o...
