Towards Adapting NMF Dictionaries Using Total Variability Modeling for Noise-Robust Acoustic Features

07/16/2019
by   Kunal Dhawan, et al.

We propose an algorithm to extract noise-robust acoustic features from noisy speech. We use Total Variability Modeling in combination with Non-negative Matrix Factorization (NMF) to learn a total variability subspace and adapt NMF dictionaries for each utterance. Unlike several other approaches for extracting noise-robust features, our algorithm does not require a training corpus of parallel clean and noisy speech. Furthermore, the proposed features are produced by an utterance-specific transform, which makes them robust to the noise occurring in each utterance. Preliminary results on the Aurora 4 + DEMAND noise corpus show that our proposed features perform comparably to baseline acoustic features, including features calculated from a convolutive NMF (CNMF) model. Moreover, on unseen noises, our proposed features give the word error rate closest to that obtained on clean speech among all the features compared.
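
For readers unfamiliar with NMF, the following is a minimal sketch of the basic factorization that the abstract builds on. It is not the authors' algorithm: it shows only plain NMF with the standard Lee-Seung multiplicative updates for the Euclidean cost, with no total variability subspace and no per-utterance dictionary adaptation. The function name `nmf`, the matrix sizes, the rank, and the random toy matrix standing in for a magnitude spectrogram are all assumptions made for illustration.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (frequency x frames) as W @ H.

    W holds the dictionary atoms (one per column) and H the activations.
    Uses Lee-Seung multiplicative updates for the Euclidean cost; this is
    an illustrative choice, not necessarily the cost used in the paper.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Strictly positive initialization keeps the updates well defined.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H
        # stay non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy non-negative matrix standing in for a magnitude spectrogram
# (hypothetical sizes: 20 frequency bins, 50 frames, true rank 4).
data_rng = np.random.default_rng(1)
V = data_rng.random((20, 4)) @ data_rng.random((4, 50))

W, H = nmf(V, rank=4)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

In a feature-extraction setting, the activations `H` (or a transform of them) would typically serve as the per-frame features, while the dictionary `W` is the part that an approach like the one described above would adapt for each utterance.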


research
07/14/2015

Feature Normalisation for Robust Speech Recognition

Speech recognition system performance degrades in noisy environments. If...
research
11/16/2018

Semi-supervised multichannel speech enhancement with variational autoencoders and non-negative matrix factorization

In this paper we address speaker-independent multichannel speech enhance...
research
06/13/2023

Unsupervised speech enhancement with deep dynamical generative speech and noise models

This work builds on a previous work on unsupervised speech enhancement u...
research
01/19/2021

A sampling algorithm to compute the set of feasible solutions for non-negative matrix factorization with an arbitrary rank

Non-negative Matrix Factorization (NMF) is a useful method to extract fe...
research
06/13/2023

Evaluating Bias and Noise Induced by the U.S. Census Bureau's Privacy Protection Methods

The United States Census Bureau faces a difficult trade-off between the ...
research
03/27/2023

Partially Adaptive Multichannel Joint Reduction of Ego-noise and Environmental Noise

Human-robot interaction relies on a noise-robust audio processing module...
research
03/29/2022

DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning

Most text-to-speech (TTS) methods use high-quality speech corpora record...
