Wake Word Detection with Streaming Transformers

by Yiming Wang, et al.

Modern wake word detection systems usually rely on neural networks for acoustic modeling. Transformers have recently shown superior performance over LSTM and convolutional networks in various sequence modeling tasks, thanks to their better temporal modeling power. However, it is not clear whether this advantage still holds for short-range temporal modeling such as wake word detection. Moreover, the vanilla Transformer is not directly applicable to the task because of its non-streaming nature and its quadratic time and space complexity. In this paper, we explore the performance of several variants of chunk-wise streaming Transformers tailored for wake word detection in a recently proposed LF-MMI system, including looking ahead to the next chunk, gradient stopping, different positional embedding methods, and adding same-layer dependency between chunks. Our experiments on the Mobvoi wake word dataset demonstrate that our proposed Transformer model outperforms the baseline convolutional network by 25% on average in false rejection rate at the same false alarm rate with a comparable model size, while still maintaining linear complexity w.r.t. the sequence length.
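To make the chunk-wise idea concrete, here is a minimal sketch of an attention mask in which each frame attends only to its own chunk, the previous chunk, and optionally the next chunk (look-ahead). The function name `chunk_attention_mask`, the one-chunk left context, and the chunk sizes are assumptions chosen for illustration; the abstract does not specify these details, so this should not be read as the authors' implementation.

```python
import numpy as np

def chunk_attention_mask(num_frames, chunk_size, look_ahead=True):
    """Return a boolean mask where mask[i, j] is True when query frame i
    may attend to key frame j.

    Assumption (illustrative only): a query sees its own chunk, one chunk
    of left context, and, if look_ahead is set, one chunk of right context.
    """
    chunk_id = np.arange(num_frames) // chunk_size
    # Key chunk index minus query chunk index, for every (query, key) pair.
    diff = chunk_id[None, :] - chunk_id[:, None]
    max_right = 1 if look_ahead else 0
    return (diff >= -1) & (diff <= max_right)

mask = chunk_attention_mask(num_frames=8, chunk_size=2)
no_look_ahead = chunk_attention_mask(num_frames=8, chunk_size=2, look_ahead=False)
```

Because every row of the mask has at most three chunks' worth of True entries, the number of attended (query, key) pairs grows linearly with the number of frames for a fixed chunk size, which is the property the abstract refers to as linear complexity.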






