Latency Control for Keyword Spotting

06/15/2022
by   Christin Jose, et al.

Conversational agents commonly use keyword spotting (KWS) to initiate voice interaction with the user. For user experience and privacy considerations, existing approaches to KWS largely focus on accuracy, which often comes at the expense of added latency. To address this tradeoff, we propose a novel approach to controlling KWS model latency that generalizes to any loss function and requires no explicit knowledge of the keyword endpoint. Through a single, tunable hyperparameter, our approach enables one to balance detection latency and accuracy for the targeted application. Empirically, we show that our approach gives superior performance under latency constraints when compared to existing methods. Namely, we achieve a substantial 25% relative improvement in false accepts for a fixed latency target when compared to the baseline state of the art. We also show that when our approach is used in conjunction with a max-pooling loss, we are able to improve relative false accepts by 25% at a fixed latency when compared to cross-entropy loss.
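To make the headline metric concrete, the following minimal sketch shows how a relative false-accept improvement is typically computed, assuming it means the fractional reduction in false accepts with respect to a baseline at the same latency target. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def relative_fa_improvement(baseline_fa: float, candidate_fa: float) -> float:
    """Fractional reduction in false accepts relative to a baseline.

    Assumes both counts were measured at the same latency target.
    """
    if baseline_fa <= 0:
        raise ValueError("baseline_fa must be positive")
    return (baseline_fa - candidate_fa) / baseline_fa


# Hypothetical counts for illustration only (not results from the paper).
improvement = relative_fa_improvement(baseline_fa=200, candidate_fa=150)
print(f"{improvement:.0%}")  # prints 25%
```

Under this reading, a 25% relative improvement means the candidate model produces three quarters as many false accepts as the baseline, whatever the absolute counts are.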


