Improving Label-Deficient Keyword Spotting Using Self-Supervised Pretraining

10/04/2022 · by Holger Severin Bovbjerg, et al.

In recent years, the development of accurate deep keyword spotting (KWS) models has led to KWS technology being embedded in products such as voice assistants. Many of these models rely on large amounts of labelled data to achieve good performance, which restricts their use to applications for which a large labelled speech data set can be obtained. Self-supervised learning seeks to mitigate the need for large labelled data sets by leveraging unlabelled data, which is far easier to obtain in large amounts. However, most self-supervised methods have only been investigated for very large models, whereas KWS models are desired to be small. In this paper, we investigate the use of self-supervised pretraining for smaller KWS models in a label-deficient scenario. We pretrain the Keyword Transformer model using the self-supervised framework Data2Vec and carry out experiments on a label-deficient setup of the Google Speech Commands data set. We find that the pretrained models greatly outperform models trained without pretraining, showing that Data2Vec pretraining can increase the performance of KWS models in label-deficient scenarios. The source code is made publicly available.
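The authors' actual implementation is in their public source code; the sketch below is only an illustration of the core Data2Vec-style objective applied to a small transformer encoder. All names here (TinyEncoder, the masking rate, top_k) are hypothetical stand-ins for the Keyword Transformer setup, not the paper's code: a student network regresses, at masked positions, a teacher's layer-averaged representations of the unmasked input, while the teacher's weights track the student as an exponential moving average (EMA).

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    """Small stand-in transformer encoder (not the actual Keyword
    Transformer). It returns every layer's output so teacher targets
    can be averaged over the top-k layers, as in Data2Vec."""
    def __init__(self, dim=64, depth=4, heads=4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, heads, 4 * dim, batch_first=True)
            for _ in range(depth)
        )

    def forward(self, x):  # x: (batch, time, dim)
        outputs = []
        for layer in self.layers:
            x = layer(x)
            outputs.append(x)
        return outputs

@torch.no_grad()
def ema_update(teacher, student, decay=0.999):
    """Teacher weights track the student as an exponential moving average."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(decay).add_(s, alpha=1.0 - decay)

def data2vec_loss(student, teacher, x, mask, top_k=2):
    """The student sees the masked input and regresses the teacher's
    layer-averaged representations of the unmasked input."""
    with torch.no_grad():
        target = torch.stack(teacher(x)[-top_k:]).mean(0)  # (B, T, D)
        # Normalise targets over time, per feature channel (simplified)
        target = F.instance_norm(target.transpose(1, 2)).transpose(1, 2)
    # Masked positions are zeroed here for brevity; Data2Vec proper
    # replaces them with a learned mask embedding.
    pred = student(x.masked_fill(mask.unsqueeze(-1), 0.0))[-1]
    return F.smooth_l1_loss(pred[mask], target[mask])

# Usage on dummy unlabelled features (e.g. patches of MFCCs):
student = TinyEncoder()
teacher = copy.deepcopy(student).eval()  # teacher starts as a copy
x = torch.randn(8, 100, 64)              # (batch, time, dim)
mask = torch.rand(8, 100) < 0.5          # mask roughly half the frames
loss = data2vec_loss(student, teacher, x, mask)
loss.backward()
ema_update(teacher, student)             # teacher follows the student
```

After pretraining, the student encoder would typically be fine-tuned on the limited labelled data with a small classification head over the keyword classes.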


Related research

03/29/2022 · Investigating Self-supervised Pretraining Frameworks for Pathological Speech Recognition
We investigate the performance of self-supervised pretraining frameworks...

09/26/2022 · Self-supervised similarity models based on well-logging data
Adopting data-based approaches leads to model improvement in numerous Oi...

09/05/2023 · A Survey of the Impact of Self-Supervised Pretraining for Diagnostic Tasks with Radiological Images
Self-supervised pretraining has been observed to be effective at improvi...

02/03/2023 · SPADE: Self-supervised Pretraining for Acoustic DisEntanglement
Self-supervised representation learning approaches have grown in popular...

11/30/2021 · MC-SSL0.0: Towards Multi-Concept Self-Supervised Learning
Self-supervised pretraining is the method of choice for natural language...

10/10/2022 · Exploiting map information for self-supervised learning in motion forecasting
Inspired by recent developments regarding the application of self-superv...

03/07/2023 · Self-supervised speech representation learning for keyword-spotting with light-weight transformers
Self-supervised speech representation learning (S3RL) is revolutionizing...
