Lattice-based Improvements for Voice Triggering Using Graph Neural Networks

01/25/2020
by Pranay Dighe, et al.

Voice-triggered smart assistants often rely on detecting a trigger phrase before they start listening for the user request. Mitigating false triggers is an important aspect of building a privacy-centric, non-intrusive smart assistant. In this paper, we address the task of false trigger mitigation (FTM) using a novel approach based on analyzing automatic speech recognition (ASR) lattices with graph neural networks (GNNs). The proposed approach exploits the fact that the decoding lattice of falsely triggered audio exhibits more uncertainty, in the form of many alternative paths and unexpected words on the lattice arcs, than the lattice of correctly triggered audio. A pure trigger-phrase detector does not fully utilize the intent of the user speech, whereas the complete decoding lattice of the user audio lets us effectively mitigate speech not intended for the smart assistant. We deploy two GNN variants, based on 1) graph convolution layers and 2) a self-attention mechanism, respectively. Our experiments demonstrate that GNNs are highly accurate on the FTM task, mitigating about 87% of false triggers. Furthermore, the proposed models are fast to train and efficient in parameter requirements.
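To make the graph-convolution variant more concrete, the following is a minimal sketch of one graph-convolution layer applied to a toy lattice, followed by mean pooling and a logistic readout. It is not the authors' model: the four-node lattice, the two per-node features (imagined here as a confidence score and an "unexpected word" score), the hand-set weights, and all function names are hypothetical, and the lattice is treated as undirected for simplicity.

```python
import math

def normalized_adjacency(num_nodes, edges):
    """Build the row-normalized adjacency with self-loops, D^-1 (A + I)."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        a[i][i] = 1.0  # self-loop keeps each node's own features
    for src, dst in edges:
        a[src][dst] = 1.0
        a[dst][src] = 1.0  # assumption: ignore arc direction
    return [[v / sum(row) for v in row] for row in a]

def matmul(x, y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def graph_conv(a_hat, h, w):
    """One graph-convolution layer: ReLU(a_hat @ h @ w)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]

def trigger_probability(a_hat, h, w, readout):
    """Mean-pool node embeddings and map them to a probability."""
    hidden = graph_conv(a_hat, h, w)
    pooled = [sum(row[j] for row in hidden) / len(hidden)
              for j in range(len(hidden[0]))]
    logit = sum(p * r for p, r in zip(pooled, readout))
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical toy lattice: two alternative paths from node 0 to node 3.
edges = [(0, 1), (1, 3), (0, 2), (2, 3)]
features = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.9, 0.1]]
weights = [[1.0, -0.5], [-0.5, 1.0]]  # hand-set, not trained
readout = [1.0, -1.0]                 # hand-set, not trained

a_hat = normalized_adjacency(4, edges)
prob = trigger_probability(a_hat, features, weights, readout)
print(f"score for 'true trigger': {prob:.3f}")
```

In a trained model the weights would be learned from labeled lattices, and several such layers would typically be stacked before pooling; the sketch only illustrates how neighborhood averaging lets evidence from alternative lattice paths influence a single utterance-level decision.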


