Multi-task Learning with Alignment Loss for Far-field Small-Footprint Keyword Spotting

05/07/2020
by Haiwei Wu, et al.

In this paper, we focus on small-footprint keyword spotting in the far-field scenario. Far-field environments are common in real-life speech applications, and they cause severe performance degradation due to room reverberation and various kinds of noise. Our baseline system is a convolutional neural network trained on pooled far-field and close-talking speech. To cope with the distortions, we adopt a multi-task learning scheme with an alignment loss that reduces the mismatch between the embedding features learned from the different data domains. Experimental results show that the proposed method maintains performance on close-talking speech and achieves a significant improvement on the far-field test set.
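The abstract does not give the exact form of the alignment loss, so the following is only a minimal sketch of how such a multi-task objective might look. It assumes that each close-talking utterance is paired with a far-field version of the same utterance, that the alignment term is the mean squared distance between the paired embeddings, and that the two terms are combined with a hypothetical weight `weight`. All function names and the weight value are illustrative and are not taken from the paper.

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy for integer class labels."""
    # Subtract the row-wise maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def alignment_loss(emb_close, emb_far):
    """Assumed form: mean squared Euclidean distance between paired embeddings."""
    return np.mean(np.sum((emb_close - emb_far) ** 2, axis=1))


def multitask_loss(logits_close, logits_far, emb_close, emb_far, labels, weight=0.1):
    """Classification loss on both domains plus a weighted alignment term.

    `weight` is a hypothetical hyperparameter, not a value from the paper.
    """
    classification = (
        softmax_cross_entropy(logits_close, labels)
        + softmax_cross_entropy(logits_far, labels)
    )
    return classification + weight * alignment_loss(emb_close, emb_far)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch, emb_dim, num_classes = 4, 8, 3
    labels = rng.integers(0, num_classes, size=batch)
    emb_close = rng.normal(size=(batch, emb_dim))
    # Simulate far-field embeddings as perturbed close-talking embeddings.
    emb_far = emb_close + 0.5 * rng.normal(size=(batch, emb_dim))
    logits_close = rng.normal(size=(batch, num_classes))
    logits_far = rng.normal(size=(batch, num_classes))
    print(multitask_loss(logits_close, logits_far, emb_close, emb_far, labels))
```

Under these assumptions, the alignment term is zero when the two domains produce identical embeddings and grows as they drift apart, so minimizing it pulls the far-field embeddings toward their close-talking counterparts while the classification terms keep both domains discriminative.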

research
11/03/2020

Training Wake Word Detection with Synthesized Speech Data on Confusion Words

Confusing-words are commonly encountered in real-life keyword spotting a...
research
04/11/2022

Small Footprint Multi-channel ConvMixer for Keyword Spotting with Centroid Based Awareness

It is critical for a keyword spotting model to have a small footprint as...
research
05/30/2020

Exploring Filterbank Learning for Keyword Spotting

Despite their great performance over the years, handcrafted speech featu...
research
06/04/2020

A study on more realistic room simulation for far-field keyword spotting

We investigate the impact of more realistic room simulation for training...
research
07/15/2021

Multi-task Learning with Cross Attention for Keyword Spotting

Keyword spotting (KWS) is an important technique for speech applications...
research
08/31/2023

Improving Small Footprint Few-shot Keyword Spotting with Supervision on Auxiliary Data

Few-shot keyword spotting (FS-KWS) models usually require large-scale an...
research
01/15/2022

ConvMixer: Feature Interactive Convolution with Curriculum Learning for Small Footprint and Noisy Far-field Keyword Spotting

Building efficient architecture in neural speech processing is paramount...
