Knowledge Transfer for Efficient On-device False Trigger Mitigation

10/20/2020
by Pranay Dighe, et al.

In this paper, we address the task of determining whether a given utterance is directed towards a voice-enabled smart-assistant device. An undirected utterance is termed a "false trigger", and false trigger mitigation (FTM) is essential for designing a privacy-centric, non-intrusive smart assistant. The directedness of an utterance can be identified by running automatic speech recognition (ASR) on it and determining the user intent by analyzing the ASR transcript. But in the case of a false trigger, transcribing the audio using ASR is itself strongly undesirable. To alleviate this issue, we propose an LSTM-based FTM architecture which determines the user intent directly from acoustic features, without explicitly generating ASR transcripts from the audio. The proposed models have a small footprint and can be run on-device with limited computational resources. During training, the model parameters are optimized using a knowledge transfer approach in which a more accurate self-attention graph neural network model serves as the teacher. Given whole audio snippets, our approach mitigates 87% of false triggers at 99% true positive rate (TPR), and in a streaming audio scenario, the system listens to only 1.69s of the false trigger audio before rejecting it while achieving the same TPR.
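The teacher-student training mentioned in the abstract can be illustrated with a generic distillation objective. The sketch below is not the paper's actual loss, which the abstract does not specify. It assumes a binary "directed vs. false trigger" decision, a student that emits a single logit, and a teacher that emits a probability. The function names and the mixing weight `alpha` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def binary_cross_entropy(target: float, prob: float, eps: float = 1e-7) -> float:
    """Cross-entropy between a (possibly soft) target and a predicted probability."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def distillation_loss(student_logit: float, teacher_prob: float,
                      label: int, alpha: float = 0.5) -> float:
    """Hypothetical teacher-student objective for one utterance.

    alpha weights the hard-label term; (1 - alpha) weights the term that
    pulls the student towards the teacher's soft prediction.
    """
    student_prob = sigmoid(student_logit)
    hard_term = binary_cross_entropy(float(label), student_prob)
    soft_term = binary_cross_entropy(teacher_prob, student_prob)
    return alpha * hard_term + (1.0 - alpha) * soft_term


# A student that agrees with a confident teacher incurs a lower loss
# than one that contradicts it.
agree = distillation_loss(student_logit=3.0, teacher_prob=0.95, label=1)
disagree = distillation_loss(student_logit=-3.0, teacher_prob=0.95, label=1)
```

In a sketch like this, the soft term lets the small on-device student learn from the teacher's graded confidence rather than from the hard label alone. How the authors actually transfer knowledge from the graph neural network teacher is described in the full text, not in the abstract.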

research
01/25/2020

Lattice-based Improvements for Voice Triggering Using Graph Neural Networks

Voice-triggered smart assistants often rely on detection of a trigger-ph...
research
05/14/2021

Streaming Transformer for Hardware Efficient Voice Trigger Detection and False Trigger Mitigation

We present a unified and hardware efficient architecture for two stage v...
research
10/21/2022

Audio-to-Intent Using Acoustic-Textual Subword Representations from End-to-End ASR

Accurate prediction of the user intent to interact with a voice assistan...
research
03/30/2022

Device-Directed Speech Detection: Regularization via Distillation for Weakly-Supervised Models

We address the problem of detecting speech directed to a device that doe...
research
05/14/2021

Listen with Intent: Improving Speech Recognition with Audio-to-Intent Front-End

Comprehending the overall intent of an utterance helps a listener recogn...
research
07/17/2020

Streaming ResLSTM with Causal Mean Aggregation for Device-Directed Utterance Detection

In this paper, we propose a streaming model to distinguish voice queries...
research
08/07/2018

Device-directed Utterance Detection

In this work, we propose a classifier for distinguishing device-directed...
