Small Footprint Multi-channel ConvMixer for Keyword Spotting with Centroid Based Awareness

04/11/2022
by Dianwen Ng, et al.

A keyword spotting model must have a small footprint because it typically runs on-device with limited computational resources. However, maintaining the previous SOTA performance at a reduced model size is challenging. In addition, far-field and noisy environments with interference from multiple signals aggravate the problem and cause accuracy to degrade significantly. In this paper, we present a multi-channel ConvMixer for speech command recognition. The novel architecture introduces an additional audio channel mixing step that lets the channels interact in a multi-channel audio setting, yielding more noise-robust features with more efficient computation. In addition, we propose a centroid based awareness component that enhances the system by equipping it with additional spatial geometry information in the latent feature projection space. We evaluate our model on the new MISP challenge 2021 dataset. Our model achieves a significant improvement over the official baseline, with a 55% gain in the competition score (0.152) on raw microphone array input and a 63% (0.126) boost with front-end speech enhancement.
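
The two ideas named in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the function names, the tensor shapes, the use of a plain linear mix across microphone channels, and the use of Euclidean distance to per-class centroids are all assumptions made for illustration only.

```python
import numpy as np

def mix_audio_channels(features, weights):
    """Linearly mix features across the microphone-channel axis.

    features: array of shape (channels, freq, time)
    weights:  array of shape (out_channels, channels)
    Hypothetical stand-in for an audio channel mixing layer.
    """
    return np.einsum("oc,cft->oft", weights, features)

def centroid_distances(embedding, centroids):
    """Euclidean distance from one embedding (dim,) to each centroid (classes, dim).

    Hypothetical stand-in for geometry information in the latent space.
    """
    return np.linalg.norm(centroids - embedding, axis=1)

rng = np.random.default_rng(0)
features = rng.standard_normal((6, 40, 100))  # assumed: 6 mics, 40 bins, 100 frames
weights = rng.standard_normal((6, 6))         # assumed: learned mixing matrix
mixed = mix_audio_channels(features, weights)

embedding = mixed.mean(axis=(1, 2))           # toy pooled embedding, shape (6,)
centroids = rng.standard_normal((2, 6))       # assumed: one centroid per class
dists = centroid_distances(embedding, centroids)
```

In this sketch every output channel is a weighted combination of all input microphone channels, and the distances to the class centroids could be supplied to a classifier as extra features. How the paper actually defines and uses either component is described in the full text.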

