Fast ASR-free and almost zero-resource keyword spotting using DTW and CNNs for humanitarian monitoring

06/25/2018
by Raghav Menon, et al.

We use dynamic time warping (DTW) as supervision for training a keyword spotting system based on a convolutional neural network (CNN), using a small set of isolated spoken keywords. The aim is to allow rapid deployment of a keyword spotting system in a new language to support urgent United Nations (UN) relief programmes in parts of Africa where languages are extremely under-resourced and the development of annotated speech resources is infeasible. First, we use 1920 recorded keywords (40 keyword types, 34 minutes of speech) as exemplars in a DTW-based template matching system and apply it to untranscribed broadcast speech. Then, we use the resulting DTW scores as targets to train a CNN on the same unlabelled speech. In this way, we use just 34 minutes of labelled speech but leverage a large amount of unlabelled data for training. While the resulting CNN keyword spotter cannot match the performance of the DTW-based system, it substantially outperforms a CNN classifier trained only on the keywords, improving the area under the ROC curve from 0.54 to 0.64. Because our CNN system is several orders of magnitude faster at runtime than the DTW system, it represents the most viable keyword spotter on this extremely limited dataset.
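The target-generation step described in the abstract (slide keyword exemplars over unlabelled speech, keep the best DTW match per keyword type, and use those scores as CNN training targets) might look roughly like the sketch below. This is not the authors' implementation. The cosine frame distance, the window hop, the length normalisation, and the mapping from cost to a score in [0, 1] are all assumptions made for illustration, and the names `dtw_cost` and `utterance_targets` are hypothetical. Inputs are assumed to be frame-level feature matrices (for example MFCCs) of shape (frames, dims).

```python
import numpy as np


def dtw_cost(x, y):
    """Length-normalised DTW alignment cost between two feature sequences.

    x has shape (n, d) and y has shape (m, d). The frame distance is
    cosine distance, which is an assumption of this sketch.
    """
    xn = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    yn = y / (np.linalg.norm(y, axis=1, keepdims=True) + 1e-8)
    dist = np.clip(1.0 - xn @ yn.T, 0.0, None)  # (n, m) frame distances
    n, m = dist.shape
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated cost matrix
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = dist[i - 1, j - 1] + best_prev
    return acc[n, m] / (n + m)


def utterance_targets(utterance, exemplars_by_keyword, hop=3):
    """Return one soft target in [0, 1] per keyword type for an utterance.

    exemplars_by_keyword is a list (one entry per keyword type) of lists
    of exemplar feature matrices. For each keyword type, the lowest DTW
    cost over all exemplars and all window positions is kept.
    """
    targets = []
    for exemplars in exemplars_by_keyword:
        best = np.inf
        for ex in exemplars:
            win = len(ex)
            last_start = max(len(utterance) - win, 0)
            for start in range(0, last_start + 1, hop):
                segment = utterance[start:start + win]
                best = min(best, dtw_cost(ex, segment))
        # Assumed mapping: low cost -> score near 1, high cost -> near 0.
        targets.append(1.0 - min(best, 1.0))
    return np.array(targets)
```

The resulting vector (one entry per keyword type) could then serve as the regression target for a CNN that takes the same utterance features as input. At test time only the CNN forward pass would be needed, which is where the runtime advantage over exhaustive DTW matching comes from.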

research
07/23/2018

ASR-free CNN-DTW keyword spotting using multilingual bottleneck features for almost zero-resource languages

We consider multilingual bottleneck features (BNFs) for nearly zero-reso...
research
11/14/2018

Almost Zero-Resource ASR-free Keyword Spotting using Multilingual Bottleneck Features and Correspondence Autoencoders

We compare features for dynamic time warping based keyword spotting in a...
research
06/01/2023

Towards hate speech detection in low-resource languages: Comparing ASR to acoustic word embeddings on Wolof and Swahili

We consider hate speech detection through keyword spotting on radio broa...
research
01/31/2020

Training Keyword Spotters with Limited and Synthesized Speech Data

With the rise of low power speech-enabled devices, there is a growing de...
research
05/18/2023

TempAdaCos: Learning Temporally Structured Embeddings for Few-Shot Keyword Spotting with Dynamic Time Warping

Few-shot keyword spotting (KWS) systems often utilize a sliding window o...
research
08/04/2023

N-gram Boosting: Improving Contextual Biasing with Normalized N-gram Targets

Accurate transcription of proper names and technical terms is particular...
research
10/05/2019

Keyword Spotter Model for Crop Pest and Disease Monitoring from Community Radio Data

In societies with well developed internet infrastructure, social media i...
