Streaming Voice Query Recognition using Causal Convolutional Recurrent Neural Networks

12/19/2018
by Raphael Tang, et al.

Voice-enabled commercial products are ubiquitous, typically enabled by lightweight on-device keyword spotting (KWS) and full automatic speech recognition (ASR) in the cloud. ASR systems require significant computational resources for both training and inference, not to mention copious amounts of annotated speech data. KWS systems, on the other hand, are less resource-intensive but have limited capabilities. On the Comcast Xfinity X1 entertainment platform, we explore a middle ground between ASR and KWS: we introduce a novel, resource-efficient neural network for voice query recognition that is much more accurate than state-of-the-art CNNs for KWS, yet can be easily trained and deployed with limited resources. On an evaluation dataset representing the top 200 voice queries, we achieve a low false alarm rate of 1% and a query error rate of 6%. Our model performs inference 8.24x faster than the current ASR system.


