Keyword spotting has been an active research area for decades. Different approaches have been proposed to detect the words of interest in speech utterances. As one solution, a general large vocabulary continuous speech recognition (LVCSR) system is applied to decode the audio signal, and keyword searching is conducted in the resulting lattices or confusion networks [1, 2, 3, 4]. These methods require relatively high computational resources for the LVCSR decoding, and also introduce latency.
Small-footprint keyword spotting systems have been attracting increasing attention. Voice assistant systems such as Alexa on Amazon Echo deploy a keyword spotting system on device, and only stream audio to the cloud for LVCSR when the keyword is detected on device. For such applications, accurate on-device keyword spotting running with low CPU and memory usage is critical [5]. It needs to run with high recall to make devices easy to use, while having low false accepts to mitigate privacy concerns. Latency has to be low as well. A traditional approach employs Hidden Markov Models (HMMs) to model both the keyword and the background [6, 7, 8]. The background includes non-keyword speech, non-speech noise, etc. This background model is also called a filler model in some literature. It may involve loops over simple speech/non-speech phones, or, in more complicated cases, the regular phone set or a confusable word set. Viterbi decoding is used to search for the best path in the decoding graph. The keyword spotting decision can be made based on a likelihood comparison between the keyword and background models. Gaussian Mixture Models (GMMs) were commonly used in the past to model the observed acoustic features. With DNNs becoming mainstream for acoustic modeling, this approach can be extended to include discriminative information by incorporating a hybrid DNN-HMM decoding framework.
In recent years, keyword spotting systems have been built directly on DNNs or Convolutional Neural Networks (CNNs), with no HMM involved [10, 11, 12, 13]. During decoding, framewise keyword posteriors are smoothed, and the system is triggered when the smoothed keyword posterior exceeds a pre-defined threshold. The trade-off between false rejects and false accepts can be controlled by tuning the threshold. Context information is handled by stacking frames as input. Some keyword spotting systems are built directly on Recurrent Neural Networks (RNNs). In particular, bidirectional LSTMs are used to search for keywords in audio streams when latency is not a hard constraint [14, 15, 16, 17].
We are interested in a small-footprint keyword spotting system that runs with low CPU and memory utilization and low latency. The low latency constraint makes a bidirectional LSTM unsuitable in principle. Instead, we focus on training a unidirectional LSTM model using two different loss functions: cross-entropy loss and max-pooling based loss. Applying the max-pooling loss function to LSTM training for keyword spotting is the main contribution of this paper.
During decoding time, the system is triggered when the keyword posterior smoothed by averaging the output of a sliding window is above a threshold. Considering the practical use case, our keyword spotting system is designed to lock out for some time after each detection, to avoid unnecessary false accepts and reduce decoding computational cost.
The remainder of this paper is organized as follows: Section 2 describes our LSTM based keyword spotting system, including the LSTM model, training loss functions and performance evaluation details. Experimental setup and results are given in Section 3. Section 4 concludes and discusses future work.
2 System Overview
As shown in Figure 1, Log Mel Filter-Bank Energies (LFBEs) are used as input acoustic features for our keyword spotting system. We extract 20 dimensional LFBEs over 25ms frames with a 10ms frame shift. The LSTM model is used to process input LFBEs. Our system has two targets in the output layer: non-keyword and keyword. The output of the keyword spotting system is passed to an evaluation module for decision making.
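As a concrete illustration of the framing described above, the number of 25ms analysis frames obtained with a 10ms shift can be computed as follows (the 16 kHz sample rate is our assumption and is not stated in the text):

```python
def num_frames(num_samples, sample_rate=16000, win_ms=25, shift_ms=10):
    """Count of analysis frames for LFBE extraction: 25 ms windows with
    a 10 ms frame shift (the sample rate is an assumed value)."""
    win = sample_rate * win_ms // 1000      # 400 samples per 25 ms window at 16 kHz
    shift = sample_rate * shift_ms // 1000  # 160 samples per 10 ms shift at 16 kHz
    if num_samples < win:
        return 0
    return 1 + (num_samples - win) // shift
```

For example, one second of 16 kHz audio yields 98 frames, each of which is then mapped to a 20 dimensional LFBE vector.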
2.1 LSTM

Different from feed-forward DNNs, RNNs contain cyclic connections which can be used to model sequential data. This makes RNNs a natural fit for modeling temporal information within continuous speech frames. However, traditional RNN structures suffer from the vanishing gradient problem, which prevents them from effectively modeling long context in the data. To overcome this, LSTMs contain memory blocks [19, 20]. Each block contains one or more memory cells, as well as input, output and forget gates. These three gates control the information flow within the associated memory block. Sometimes a projection layer is added on top of the LSTM output to reduce model complexity [21]. A typical LSTM component with a projection layer is shown in Figure 2. For the sake of clarity, a single LSTM block is shown.
Given a sequence of $T$ frames $\mathbf{x} = (x_1, \dots, x_T)$, let $i_t$, $o_t$, $f_t$ and $c_t$ denote the input, output, forget gates and the memory cell, and let $y_t$ be the output. The LSTM computes the gate activations and output at time $t$ as follows:

$$i_t = \sigma(W_{ix} x_t + W_{ir} r_{t-1} + W_{ic} c_{t-1} + b_i)$$
$$f_t = \sigma(W_{fx} x_t + W_{fr} r_{t-1} + W_{fc} c_{t-1} + b_f)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_{cx} x_t + W_{cr} r_{t-1} + b_c)$$
$$o_t = \sigma(W_{ox} x_t + W_{or} r_{t-1} + W_{oc} c_t + b_o)$$
$$m_t = o_t \odot \tanh(c_t)$$
$$r_t = W_{rm} m_t$$
$$y_t = W_{yr} r_t + b_y$$

Here the matrices $W$ label the connection weights. E.g., $W_{ix}$, $W_{ir}$ and $W_{ic}$ represent the weight matrices from the input $x_t$, the recurrent feedback $r_{t-1}$ and the cell $c_{t-1}$ to the input gate, respectively. Note that the peephole connections $W_{ic}$, $W_{fc}$ and $W_{oc}$ are diagonal matrices. The $b$ terms represent the bias vectors for the different components of the model. E.g., $b_i$ is the bias for the input gate activation.

A projection layer is added to the LSTM output. That is, $W_{rm}$ linearly maps the LSTM output $m_t$ to a lower dimensional representation $r_t$, which serves as the recurrent signal. The network output $y_t$ is computed based on the projection layer output as well.
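The gate and projection computations above can be sketched in NumPy as a single time step; the parameter naming and dictionary layout here are ours, and this is an illustrative forward pass rather than the authors' implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstmp_step(x_t, r_prev, c_prev, p):
    """One time step of an LSTM with a projection layer (LSTMP).

    p is a dict of parameters: W_*x matrices map the input x_t, W_*r
    matrices map the recurrent projection r_{t-1}, and w_ic/w_fc/w_oc
    hold the diagonal peephole connections (stored as vectors, so a
    diagonal matrix product becomes an elementwise multiply).
    """
    i_t = sigmoid(p["W_ix"] @ x_t + p["W_ir"] @ r_prev + p["w_ic"] * c_prev + p["b_i"])
    f_t = sigmoid(p["W_fx"] @ x_t + p["W_fr"] @ r_prev + p["w_fc"] * c_prev + p["b_f"])
    c_t = f_t * c_prev + i_t * np.tanh(p["W_cx"] @ x_t + p["W_cr"] @ r_prev + p["b_c"])
    o_t = sigmoid(p["W_ox"] @ x_t + p["W_or"] @ r_prev + p["w_oc"] * c_t + p["b_o"])
    m_t = o_t * np.tanh(c_t)  # cell output
    r_t = p["W_rm"] @ m_t     # low-dimensional projection; this is the recurrent signal
    return r_t, c_t
```

The network output layer (a softmax over $y_t = W_{yr} r_t + b_y$) would be applied to `r_t` at each frame.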
Finally, the complexity of the model described above can be calculated as
$$N = 4\, n_c n_i + 4\, n_c n_r + n_c n_r + 3\, n_c + n_r n_o$$
where $n_c$ is the number of memory cells (we only consider the case of a single memory cell per block, thus $n_c$ is also the number of memory blocks), $n_r$ is the dimension of the projection layer, and $n_i$ and $n_o$ denote the dimensions of the input and output, respectively.
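As a sketch, this parameter count can be computed directly; the dimensions in the example below are hypothetical, chosen only to match the single-layer sizes discussed later in the paper:

```python
def lstmp_num_params(n_i, n_c, n_r, n_o):
    """Approximate parameter count of a single-layer LSTMP model.

    n_i: input dim, n_c: memory cells, n_r: projection dim, n_o: output dim.
    Counts the four input and four recurrent gate/cell matrices, the three
    diagonal peepholes, the projection matrix and the output layer
    (bias terms are omitted, matching the complexity formula above).
    """
    return 4 * n_c * n_i + 4 * n_c * n_r + n_c * n_r + 3 * n_c + n_r * n_o
```

For instance, with a hypothetical 20 dimensional input, 64 memory cells, a 32 dimensional projection and 2 output classes, this gives 15,616 parameters.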
2.2 Loss Functions
For our experiments, we consider two different types of loss functions: cross-entropy loss and max-pooling loss.
Cross-entropy (xent) has been widely applied as a loss function for DNN and RNN training [22]. Let $K$ be the total number of classes. Given a sequence of $N$ frames $\mathbf{x} = (x_1, \dots, x_N)$, where $x_i$ is the feature vector of the $i$th frame, let $z_i$ denote the $K$-dimensional output of the network for $x_i$, and let $t_i$ denote the corresponding target vector. The cross-entropy loss for the $i$th frame is calculated as follows:
$$\mathcal{L}_{\mathrm{xent}}^{(i)} = -\sum_{k=1}^{K} t_{ik} \log z_{ik}$$
The $1$-of-$K$ coding is usually used for the target vector $t_i$. That is, if the $i$th frame is aligned with class $k$, the $K$-dimensional vector $t_i$ has value $1$ for the $k$th element, with all other elements being $0$. Let $c_i$ denote the aligned class for the $i$th frame. The cross-entropy loss for the $i$th frame can then be formulated as
$$\mathcal{L}_{\mathrm{xent}}^{(i)} = -\log z_{ic_i}$$
The cross-entropy loss for the whole frame sequence is then:
$$\mathcal{L}_{\mathrm{xent}} = \sum_{i=1}^{N} \mathcal{L}_{\mathrm{xent}}^{(i)} = -\sum_{i=1}^{N} \log z_{ic_i}$$
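A minimal sketch of the sequence-level cross-entropy loss in pure Python (function and argument names are our own):

```python
import math

def xent_loss(posteriors, labels):
    """Cross-entropy loss summed over a frame sequence.

    posteriors: list of per-frame class posterior lists (each sums to 1),
    i.e. the network outputs z_i.
    labels: the aligned class index c_i for each frame (1-of-K targets),
    so the per-frame loss reduces to -log z_{i,c_i}.
    """
    return -sum(math.log(z[c]) for z, c in zip(posteriors, labels))
```

For two frames with posteriors `[0.9, 0.1]` and `[0.2, 0.8]` aligned to classes 0 and 1, the loss is `-(log 0.9 + log 0.8)`.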
We propose to train the LSTM for keyword spotting using a max-pooling based loss function. Given that the LSTM has the ability to model long context information, we hypothesize that there is no need to teach the LSTM to fire at every frame within the keyword segment. Instead, we want to teach the LSTM to fire at its highest-confidence time, which in general should be near the end of the keyword segment, where it has seen enough context to make a decision. A simple approach is to back-propagate loss only from the last frame, or the last several frames, when updating the weights; however, our initial experiments indicate that the LSTM does not learn much from this scheme. Hence we employ a max-pooling based loss function to let the LSTM pick the most informative keyword frames to teach itself. This also helps mitigate issues potentially caused by inaccurate frame alignment around keyword segment boundaries. Max-pooling loss can be viewed as a transition from frame-level loss to segment-level loss for keyword spotting model training.
Alternative segment-level loss functions include different statistics of frame-level keyword posteriors within a keyword segment, e.g., the geometric mean. There has also been work on training LSTMs with Connectionist Temporal Classification (CTC) [14, 15, 16, 23] for keyword spotting tasks. In addition, architectures combining LSTMs and CNNs have been applied to different tasks [24, 25]. Typically the LSTM is added on top of CNN layers, where the CNN layers with pooling extract features as LSTM input, and the LSTM output is used for prediction.
Let $|\mathcal{K}|$ denote the cardinality of the target keyword set. When we consider word level labels, there are in total $K = |\mathcal{K}| + 1$ classes, with one additional class used to label those frames aligned with background. For the input frames $\mathbf{x} = (x_1, \dots, x_N)$, if there are $P$ keyword instances inside, we use $s_j$ to denote the continuous frame index range whose frames are aligned with the $j$th keyword. As a result, the $K$-dimensional target vector $t_i$ is the same for all frames within $s_j$.

Let $S = \{s_1, \dots, s_P\}$ represent the collection of frame index ranges for all keyword instances in $\mathbf{x}$, and let $B$ be the collection of all the indices for the remaining frames which are not aligned to any keyword (i.e., background frame indices). We use $k_j$ to represent the target label for frames inside $s_j$, and $i_j = \arg\max_{i \in s_j} z_{ik_j}$ to label the specific frame within $s_j$ whose posterior for $k_j$ is the maximum. The max-pooling loss proposed for the input sequence $\mathbf{x}$ can be calculated as
$$\mathcal{L}_{\mathrm{maxpool}} = -\sum_{i \in B} \log z_{ic_i} - \sum_{j=1}^{P} \log z_{i_j k_j}$$
The first term states that we calculate the cross-entropy loss for input frames not aligned to any keyword. The second term shows how we do max-pooling for keyword aligned frames. In more detail, the frames of the $j$th segment (index range $s_j$) are aligned to keyword $k_j$. We only back-propagate for the single frame (index $i_j$) whose posterior for target $k_j$ is the largest among all frames within the current segment $s_j$, and discard all other frames within the segment.
The idea of max-pooling loss is shown in Figure 3, where filled frames are aligned with the keywords, and empty frames are background. Given an input sequence of frames, within each keyword segment only the frame which has the maximum posterior for the corresponding keyword target is kept, while all other frames within the same keyword segment are discarded. All background frames are kept.
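The loss computation just described can be sketched as follows (a minimal pure-Python illustration with our own function and argument names, showing which frames contribute to the loss rather than a full training implementation):

```python
import math

def max_pooling_loss(posteriors, labels, keyword_segments):
    """Max-pooling loss for one input sequence.

    posteriors: per-frame class posterior lists (network outputs z_i).
    labels: per-frame aligned class indices c_i.
    keyword_segments: list of (start, end) frame index ranges aligned
    with keyword instances (start inclusive, end exclusive).
    Background frames contribute ordinary cross-entropy; within each
    keyword segment, only the frame with the maximum posterior for the
    keyword target contributes (all other keyword frames are discarded).
    """
    in_keyword = set()
    loss = 0.0
    for start, end in keyword_segments:
        in_keyword.update(range(start, end))
        k = labels[start]  # same target label for all frames in the segment
        loss -= math.log(max(posteriors[i][k] for i in range(start, end)))
    for i, (z, c) in enumerate(zip(posteriors, labels)):
        if i not in in_keyword:  # background frame: plain cross-entropy
            loss -= math.log(z[c])
    return loss
```

With class 0 as background and class 1 as the keyword, a 4-frame sequence whose middle two frames form one keyword segment only back-propagates the better of the two keyword frames.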
We consider two cases for max-pooling loss based LSTM training: one starts with a randomly initialized model, and the other uses a cross-entropy loss pre-trained model. With a randomly initialized model, max-pooling loss based LSTM training may not learn well in the first few epochs with rather random keyword firing. The idea is to take the advantages of both cross-entropy and max-pooling loss training. With a cross-entropy trained LSTM as the initial model to start max-pooling training, it already learns some basic knowledge about target keywords. This could provide a better initialization point, and faster convergence to a better local optimum.
2.3 Evaluation Method
We consider a posteriors smoothing based evaluation scheme. Given input audio, the system computes smoothed posteriors by averaging over a sliding context window of frames. When the smoothed posterior for the keyword exceeds a pre-defined threshold, this is counted as a firing spike. The system is then designed to shut down for a fixed number of following frames. This lockout period reduces unnecessarily duplicated detections within the same keyword segment, and also reduces decoding computational cost.
For our use case, we allow a short latency window after each keyword segment. That is, if the system fires within the latency window right after a keyword segment, we still consider the firing as being aligned with the corresponding keyword. This latency window does not introduce significant delay in perception, and it mitigates possible issues of inaccurate keyword alignment boundaries in evaluation.
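The smoothing, thresholding and lockout logic can be sketched as follows; the parameter names `w_smooth` and `w_lockout` are ours, and the paper's actual frame counts are not reproduced here:

```python
def detect(keyword_posteriors, threshold, w_smooth, w_lockout):
    """Posterior-smoothing keyword detector sketch.

    keyword_posteriors: framewise keyword posteriors from the network.
    Returns the frame indices at which the system fires: the posterior
    averaged over a sliding window of up to w_smooth frames exceeds the
    threshold, after which the detector locks out for w_lockout frames.
    """
    firings, lockout_until = [], 0
    for t in range(len(keyword_posteriors)):
        if t < lockout_until:
            continue  # still inside the lockout period after a firing
        window = keyword_posteriors[max(0, t - w_smooth + 1): t + 1]
        if sum(window) / len(window) > threshold:
            firings.append(t)
            lockout_until = t + 1 + w_lockout
    return firings
```

Tuning `threshold` trades false rejects against false accepts, exactly as with the framewise-posterior systems discussed in the introduction.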
Finally, the first firing spike within each keyword segment plus latency window is considered a valid detection. Any other firing spike within the same keyword segment plus latency window, or outside any keyword segment plus latency window, is counted as a false accept. Two metrics are used to measure system performance: miss rate, which is one minus recall, and false accept rate, which is a normalized count of false accepts.
Figure 4 illustrates the idea of our evaluation approach with two example input audio streams. The keyword segment length varies depending on the way the keyword is spoken. Each keyword segment is followed by a fixed-length latency window. The keyword segments are labeled by blocks with vertical line fill, while the follow-on latency windows are labeled by blocks with horizontal line fill. There is a system lockout period by design after each firing spike. For the first audio, there are two false accepts (FAs), with the system firing in the region outside any keyword segment plus latency window. The true accepts (TAs) happen as the first detection in each keyword segment plus latency window; true accepts can happen either in the keyword segment or in the following latency window. For the second audio, false accepts happen as additional firing spikes within a keyword segment plus latency window which already has a true accept.
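The TA/FA counting rules above can be sketched as a small scoring routine (function names and the window representation are our own):

```python
def score(firings, keyword_windows):
    """Count true accepts, false accepts and misses for one audio stream.

    firings: frame indices of detector firings, in increasing order.
    keyword_windows: list of (start, end) frame ranges, each covering a
    keyword segment plus its follow-on latency window (end exclusive).
    The first firing inside a window is a true accept; every other
    firing, whether inside an already-detected window or outside all
    windows, is a false accept. Undetected windows are misses.
    """
    detected = set()
    true_accepts = false_accepts = 0
    for t in firings:
        hit = next((w for w in keyword_windows if w[0] <= t < w[1]), None)
        if hit is not None and hit not in detected:
            detected.add(hit)
            true_accepts += 1
        else:
            false_accepts += 1
    misses = len(keyword_windows) - len(detected)
    return true_accepts, false_accepts, misses
```

The miss rate is then `misses` over the number of keyword windows, and the false accept rate is `false_accepts` under whatever normalization is chosen (per utterance in our experiments).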
For our system, fixed numbers of frames are used for the posteriors smoothing window, the lockout period, and the allowed latency window after each aligned wake word segment.
3 Experimental Results
For our experiments, the word ’Alexa’ is chosen as the keyword. We use an in-house far-field corpus containing far-field data collected under different conditions. This dataset contains an order of magnitude more keyword and background speech utterances than the largest previous studies [10, 12], for both training and testing. Our data is collected in a far-field environment, which by nature is a more challenging task. Given the large size of our corpus, the development set partition is sufficient to tune parameters, and the test set partition is large enough to show statistically significant differences.
Since we target only the keyword ’Alexa’, a binary target set is used for our experiments. Frames of background have target label 0, while frames aligned with the keyword have target label 1. We train a feed-forward DNN model as the baseline, based on the model structure and training described in [10], with some adaptations to our experimental setup and use case. We compare it with LSTM models trained with cross-entropy loss and max-pooling loss.
3.1 Model Training
The GPU-based distributed trainer described in [26] is used for our experiments. A performance-based learning rate schedule is used for model training. To elaborate: for each training epoch, if the loss on the dev set degrades compared to the previous epoch, the learning rate is halved and the current epoch is repeated with the reduced learning rate. The training process terminates when either the minimal learning rate (a fixed fraction of the initial learning rate in our case) or the maximum number of epochs is reached. The initial learning rate and batch size are tuned on the development set.
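The per-epoch decision of this schedule can be sketched as a small function (our own formulation; in practice the model weights would also be restored before repeating an epoch):

```python
def lr_schedule_step(lr, prev_loss, curr_loss, min_lr):
    """One decision of the performance-based learning rate schedule.

    If the dev loss degraded relative to the previous epoch, halve the
    learning rate and signal that the epoch should be repeated; training
    stops once the learning rate falls below min_lr.
    Returns (new_lr, repeat_epoch, stop).
    """
    if curr_loss > prev_loss:
        new_lr = lr / 2.0
        return new_lr, True, new_lr < min_lr
    return lr, False, False
```

The maximum-epoch budget mentioned above would be enforced by the surrounding training loop.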
The baseline feed-forward DNN has four hidden layers, with 128 nodes per hidden layer and sigmoid activations. A stack of 20 frames on the left and 10 frames on the right is used to form the input feature vector. Note that the right context cannot be too large, since it introduces latency. Layerwise pre-training is used for the DNN, with the initial learning rate and batch size tuned on the development set.
For LSTM training with different loss functions, we use a single layer of unidirectional LSTM with 64 memory blocks and a projection layer of dimension 32. This serves the purpose of low CPU and memory usage, as well as low latency. For the input context, we stack frames on the left and on the right of the current frame. Note that we still use left-context frames for LSTM input, though the LSTM learns past frames’ information by definition. By doing this, the DNN and LSTM training setups are better aligned for comparison, and past information is further imposed in LSTM training. For random initialization, the LSTM parameters are initialized with a uniform distribution for the weights and a constant for the biases. The initial learning rates are chosen separately for the three cases: cross-entropy loss, max-pooling loss with a randomly initialized model, and max-pooling loss initialized with a cross-entropy pre-trained model.
3.2 System Performance
We plot detection error tradeoff (DET) curves in a low miss rate range. Here the false accept rate is computed by normalizing the false accept counts by the total number of test utterances. The x-axis labels false accept rate, and the y-axis labels miss rate; lower numbers indicate better performance. The blue solid curve represents the baseline feed-forward DNN trained using cross-entropy loss. The LSTM models trained using cross-entropy loss, max-pooling loss with random initialization, and max-pooling loss with cross-entropy pre-training are labeled by the green dashed, red dash-dot and cyan dotted curves, respectively. Absolute numbers of false accepts have been obscured for confidentiality reasons; instead, we plot false accept rates up to a multiplicative constant. The false accept range considered in our experiments is aligned with a low value range suitable for production deployment.
In the selected low miss rate range, the LSTM models outperform the baseline feed-forward DNN. In terms of loss functions for LSTM training, max-pooling loss with random initialization is superior to cross-entropy loss, and the LSTM trained using max-pooling loss with cross-entropy pre-training yields the best results. We compute Area Under the Curve (AUC) numbers for quantitative comparison of the different models. AUC is computed on DET curves, hence lower is better. The relative changes in AUC for the LSTM models compared to the baseline DNN are summarized in Table 1. Our experimental results indicate that in the low miss rate range, compared to the cross-entropy loss trained baseline DNN, the cross-entropy loss trained LSTM yields a relative reduction in AUC. The LSTM model trained using max-pooling loss with random initialization shows a further relative reduction in AUC. The best performance comes from the LSTM trained using max-pooling loss with cross-entropy pre-training, which yields the largest relative AUC reduction compared to the baseline DNN.
Table 1: Relative change in AUC, compared to the baseline DNN, for the LSTM trained with cross-entropy loss, max-pooling loss with random initialization, and max-pooling loss with xent pre-training.
4 Conclusion and Future Work
We present our work on training a small-footprint LSTM to spot the keyword ’Alexa’ in far-field conditions. Two loss functions are employed for LSTM training: cross-entropy loss and the max-pooling loss proposed in this paper. A smoothed posterior thresholding approach is used for evaluation, with keyword spotting performance measured by miss rate and false accept rate. We show that the LSTM performs better than the DNN in general. The best LSTM system, trained using max-pooling loss with cross-entropy loss pre-training, yields the largest relative reduction in AUC in the low miss rate range.
For future work, we plan to add weighting to max-pooling loss based LSTM training, i.e., scaling the back-propagated loss for the selected keyword frames. It is of interest to see whether LSTM performance can be further improved by varying model structures, e.g., adding feed-forward layers on top of the LSTM component. We also plan to benchmark max-pooling loss against other segment-level loss functions, e.g., the geometric mean of framewise keyword posteriors within each keyword segment, CTC, etc., for our keyword spotting experiments.
-  Miller, D.R., Kleber, M., Kao, C.L., Kimball, O., Colthurst T., Lowe, S.A., Schwartz, R.M., and Gish, H., “Rapid and accurate spoken term detection”, in Proceedings of Annual Conference of the International Speech Communication Association (Interspeech), 2007.
-  Parlak, S. and Saraclar, M., “Spoken term detection for Turkish broadcast news”, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5244-5247, 2008.
-  Chen, G., Yilmaz, O., Trmal, J., Povey, D. and Khudanpur, S., “Using proxies for OOV keywords in the keyword search task”, in IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp. 416-421, 2013.
-  Tsakalidis, S., Hsiao, R., Karakos, D., Ng, T., Ranjan, S., Saikumar, G., Zhang, L., Nguyen, L., Schwartz, R. and Makhoul, J., “The 2013 BBN vietnamese telephone speech keyword spotting system”, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7829-7833, 2014.
-  Sun, M., Nagaraja, V., Hoffmeister, B. and Vitaladevuni, S., “Model Shrinking for Embedded Keyword Spotting”, in IEEE 14th International Conference on Machine Learning and Applications (ICMLA), 2015.
-  Rose, R.C. and Paul, D.B., “A hidden Markov model based keyword recognition system”, in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 129-132, 1990.
-  Wilpon, J.G., Rabiner, L., Lee, C.H. and Goldman, E.R., “Automatic recognition of keywords in unconstrained speech using hidden Markov models”, IEEE Transactions on Acoustics, Speech and Signal Processing, 38(11):1870-1878, 1990.
-  Wilpon, J.G., Miller, L.G. and Modi, P., “Improvements and applications for key word recognition using hidden Markov modeling techniques”, in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 309-312, 1991.
-  Panchapagesan, S., Sun, M., Khare, A., Matsoukas, S., Mandal, A., Hoffmeister, B., and Vitaladevuni, S., “Multi-task learning and weighted cross-entropy for dnn-based keyword spotting”, in Proceedings of Annual Conference of the International Speech Communication Association (Interspeech), 2016.
-  Chen, G., Parada, C. and Heigold, G., “Small-footprint keyword spotting using deep neural networks”, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4087-4091, 2014.
-  Nakkiran, P., Alvarez, R., Prabhavalkar, R. and Parada, C., “Compressing deep neural networks using a rank-constrained topology”, in Proceedings of Annual Conference of the International Speech Communication Association (Interspeech), 2015.
-  Sainath, T. and Parada, C., “Convolutional neural networks for small-footprint keyword spotting”, in Proceedings of Annual Conference of the International Speech Communication Association (Interspeech), 2015.
-  Tucker, G., Wu, M., Sun, M., Panchapagesan, S., Fu, G. and Vitaladevuni, S., “Model compression applied to small-footprint keyword spotting”, in Proceedings of Annual Conference of the International Speech Communication Association (Interspeech), 2016.
-  Fernández, S., Graves, A. and Schmidhuber, J., “An application of recurrent neural networks to discriminative keyword spotting”, in Artificial Neural Networks-ICANN, pp. 220-229, 2007.
-  Wöllmer, M., Schuller, B. and Rigoll, G., “Keyword spotting exploiting long short-term memory”, in Speech Communication, 55(2), pp. 252-265, 2013.
-  Baljekar, P., Lehman, J.F., and Singh, R., “Online word-spotting in continuous speech with recurrent neural networks”, in IEEE Spoken Language Technology Workshop (SLT), 2014.
-  Sundar, H., Lehman, J.F. and Singh, R., “Keyword spotting in multi-player voice driven games for children”, in Proceedings of Annual Conference of the International Speech Communication Association (Interspeech), 2015.
-  Scherer, D., Müller, A. and Behnke, S., “Evaluation of pooling operations in convolutional architectures for object recognition”, in Proceedings of International Conference on Artificial Neural Networks, pp. 92-101, 2010.
-  Hochreiter, S. and Schmidhuber, J., “Long Short-Term Memory”, in Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
-  Gers, F.A., Schraudolph, N.N. and Schmidhuber, J., “Learning precise timing with LSTM recurrent networks”, in Journal of machine learning research, vol. 3, pp. 115-143, 2002.
-  Sak, H., Senior, A.W. and Beaufays, F., “Long short-term memory recurrent neural network architectures for large scale acoustic modeling”, in Proceedings of Annual Conference of the International Speech Communication Association (Interspeech), 2014.
-  Bishop, C., “Pattern Recognition and Machine Learning”, Springer, 2006.
-  Graves, A., Fernandez, S., Gomez, F. and Schmidhuber, J., “Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks”, in Proceedings of the International Conference on Machine Learning (ICML), 2006.
-  Yue-Hei Ng, J., Hausknecht, M., Vijayanarasimhan, S., Vinyals, O., Monga, R. and Toderici, G., “Beyond short snippets: Deep networks for video classification”, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
-  Xu, Z., Li, S. and Deng, W., “Learning temporal features using LSTM-CNN architecture for face anti-spoofing”, in Proceedings of IAPR Asian Conference on Pattern Recognition (ACPR), 2015.
-  Ström, N., “Scalable Distributed DNN Training Using Commodity GPU Cloud Computing”, in Proceedings of Annual Conference of the International Speech Communication Association (Interspeech), 2015.