Towards end-to-end spoken language understanding

02/23/2018
by Dmitriy Serdyuk, et al.

A spoken language understanding system is traditionally designed as a pipeline of several components. First, an automatic speech recognizer processes the audio signal to produce a transcription or n-best hypotheses. A natural language understanding system then classifies the recognized text into structured data, such as domain, intent, and slots, for downstream consumers such as dialog systems and hands-free applications. These components are usually developed and optimized independently. In this paper, we present our study of an end-to-end learning system for spoken language understanding. With this unified approach, we can infer the semantic meaning directly from audio features, without an intermediate text representation. The study shows that the trained model can achieve reasonably good results and demonstrates that the model can capture semantic attention directly from the audio features.
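
The contrast between the two designs in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every name in it (`SluResult`, `make_pipeline_slu`, `toy_asr`, `toy_nlu`, `toy_end_to_end`) is hypothetical, the feature layout (a list of frames, each a list of floats) is an assumption, and the toy functions stand in for trained models only to show how the interfaces differ.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Sequence

# Assumed layout: one inner sequence of floats per audio frame.
AudioFeatures = Sequence[Sequence[float]]


@dataclass
class SluResult:
    """Structured output for downstream consumers (hypothetical type)."""
    domain: str
    intent: str
    slots: Dict[str, str] = field(default_factory=dict)


def make_pipeline_slu(
    asr: Callable[[AudioFeatures], str],
    nlu: Callable[[str], SluResult],
) -> Callable[[AudioFeatures], SluResult]:
    """Compose two independently built components: audio -> text -> semantics."""
    def run(features: AudioFeatures) -> SluResult:
        text = asr(features)   # intermediate text representation
        return nlu(text)       # classification happens on text only
    return run


def toy_asr(features: AudioFeatures) -> str:
    # Stand-in for a speech recognizer; always returns the same transcript.
    return "play some jazz"


def toy_nlu(text: str) -> SluResult:
    # Stand-in for a text classifier; a single keyword rule.
    words = text.split()
    if words and words[0] == "play":
        return SluResult("music", "play_music", {"genre": words[-1]})
    return SluResult("unknown", "unknown")


def toy_end_to_end(features: AudioFeatures) -> SluResult:
    # Stand-in for a single trained model mapping audio features straight
    # to semantics. The mean-value threshold is arbitrary, not a real model.
    total = sum(sum(frame) for frame in features)
    count = max(1, sum(len(frame) for frame in features))
    if total / count > 0.5:
        return SluResult("music", "play_music")
    return SluResult("unknown", "unknown")


pipeline = make_pipeline_slu(toy_asr, toy_nlu)
features = [[0.9, 0.8], [0.7, 0.6]]
print(pipeline(features))        # goes through the text transcript
print(toy_end_to_end(features))  # no text is produced at any point
```

Both callables share the signature `AudioFeatures -> SluResult`, so a downstream consumer would not need to know which design produced the result. The difference is internal: the pipeline exposes a text transcript between its two stages, while the end-to-end function never produces one.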

09/24/2018

From Audio to Semantics: Approaches to end-to-end spoken language understanding

Conventional spoken language understanding systems consist of two main c...
08/12/2020

End-to-End Neural Transformer Based Spoken Language Understanding

Spoken language understanding (SLU) refers to the process of inferring t...
10/06/2020

Textual Supervision for Visually Grounded Spoken Language Understanding

Visually-grounded models of spoken language understanding extract semant...
07/13/2017

Predicting Causes of Reformulation in Intelligent Assistants

Intelligent assistants (IAs) such as Siri and Cortana conversationally i...
02/14/2020

A Data Efficient End-To-End Spoken Language Understanding Architecture

End-to-end architectures have been recently proposed for spoken language...
04/08/2021

RNN Transducer Models For Spoken Language Understanding

We present a comprehensive study on building and adapting RNN transducer...