DCCRN-KWS: an audio bias based model for noise robust small-footprint keyword spotting

05/21/2023
by Shubo Lv, et al.

Real-world complex acoustic environments, especially those with a low signal-to-noise ratio (SNR), pose tremendous challenges to a keyword spotting (KWS) system. Inspired by recent advances in neural speech enhancement and context bias in speech recognition, we propose a robust audio context bias based DCCRN-KWS model to address this challenge. We formulate the whole architecture as a multi-task learning framework for both denoising and keyword spotting, where the DCCRN encoder is connected to the KWS model. Aided by the denoising task, we further introduce an audio context bias module to leverage real keyword samples and bias the network to better discriminate keywords in noisy conditions. Feature merge and complex context linear modules are also introduced to strengthen such discrimination and to effectively leverage contextual information, respectively. Experiments on an internal challenging dataset and the HIMIYA public dataset show that our DCCRN-KWS system is superior in performance, while an ablation study demonstrates the good design of the whole model.
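
The multi-task idea described above, training one network on both denoising and keyword spotting, can be illustrated with a combined objective. The following is only a minimal sketch under stated assumptions: the abstract does not specify the loss functions or their weighting, so the mean-squared-error denoising term, the binary cross-entropy keyword term, and the names `multitask_loss` and `denoise_weight` are all hypothetical stand-ins chosen for illustration.

```python
import math


def mse(estimate, target):
    """Mean squared error between two equal-length sequences (assumed denoising loss)."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def binary_cross_entropy(prob, label, eps=1e-7):
    """Binary cross-entropy for one keyword posterior (assumed KWS loss)."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def multitask_loss(enhanced, clean, keyword_prob, keyword_label, denoise_weight=0.5):
    """Weighted sum of the two task losses; the weight is a hypothetical hyperparameter."""
    denoise_term = mse(enhanced, clean)
    kws_term = binary_cross_entropy(keyword_prob, keyword_label)
    return denoise_weight * denoise_term + kws_term


# Example: imperfect enhancement plus an uncertain keyword posterior.
loss = multitask_loss([1.0, 1.0], [0.0, 0.0], keyword_prob=0.5, keyword_label=1)
```

In a sketch like this, the weight controls how strongly the shared encoder is pushed toward clean-speech reconstruction relative to keyword discrimination; the paper's actual balance between the two tasks may differ.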


research
04/11/2022

Small Footprint Multi-channel ConvMixer for Keyword Spotting with Centroid Based Awareness

It is critical for a keyword spotting model to have a small footprint as...
research
10/23/2020

Speech enhancement aided end-to-end multi-task learning for voice activity detection

Robust voice activity detection (VAD) is a challenging task in low signa...
research
08/31/2023

Improving vision-inspired keyword spotting using dynamic module skipping in streaming conformer encoder

Using a vision-inspired keyword spotting framework, we propose an archit...
research
06/20/2019

A Monaural Speech Enhancement Method for Robust Small-Footprint Keyword Spotting

Robustness against noise is critical for keyword spotting (KWS) in real-...
research
07/04/2022

CaTT-KWS: A Multi-stage Customized Keyword Spotting Framework based on Cascaded Transducer-Transformer

Customized keyword spotting (KWS) has great potential to be deployed on ...
research
11/20/2021

Implicit Acoustic Echo Cancellation for Keyword Spotting and Device-Directed Speech Detection

In many speech-enabled human-machine interaction scenarios, user speech ...
research
11/19/2022

Filterbank Learning for Small-Footprint Keyword Spotting Robust to Noise

In the context of keyword spotting (KWS), the replacement of handcrafted...
