A low latency ASR-free end to end spoken language understanding system

11/10/2020
by   Mohamed Mhiri, et al.
0

In recent years, developing a speech understanding system that classifies a waveform to structured data, such as intents and slots, without first transcribing the speech to text has emerged as an interesting research problem. This work proposes such as system with an additional constraint of designing a system that has a small enough footprint to run on small micro-controllers and embedded systems with minimal latency. Given a streaming input speech signal, the proposed system can process it segment-by-segment without the need to have the entire stream at the moment of processing. The proposed system is evaluated on the publicly available Fluent Speech Commands dataset. Experiments show that the proposed system yields state-of-the-art performance with the advantage of low latency and a much smaller model when compared to other published works on the same task.

READ FULL TEXT

Authors

page 1

page 2

page 3

page 4

05/20/2021

A Streaming End-to-End Framework For Spoken Language Understanding

End-to-end spoken language understanding (SLU) has recently attracted in...
12/15/2020

Exploring Transfer Learning For End-to-End Spoken Language Understanding

Voice Assistants such as Alexa, Siri, and Google Assistant typically use...
07/20/2021

Streaming End-to-End ASR based on Blockwise Non-Autoregressive Models

Non-autoregressive (NAR) modeling has gained more and more attention in ...
11/29/2021

ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet

As Automatic Speech Processing (ASR) systems are getting better, there i...
06/01/2020

Streaming Language Identification using Combination of Acoustic Representations and ASR Hypotheses

This paper presents our modeling and architecture approaches for buildin...
04/01/2022

Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding

End-to-end Spoken Language Understanding (E2E SLU) has attracted increas...
04/22/2022

E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR

Improving the performance of end-to-end ASR models on long utterances ra...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.