End-to-end speech-to-dialog-act recognition

04/23/2020
by Viet-Trung Dang, et al.

Spoken language understanding, which extracts intents and/or semantic concepts from utterances, is conventionally formulated as a post-processing step after automatic speech recognition (ASR). It is usually trained on oracle transcripts, but at run time it must cope with ASR errors. Moreover, some acoustic features are related to intents but are not represented in the transcripts. In this paper, we present an end-to-end model that converts speech directly into dialog acts without a deterministic transcription step. In the proposed model, the dialog act recognition network is joined to an acoustic-to-word ASR model at the latent layer just before the softmax layer, which provides a distributed representation of word-level ASR decoding information. The entire network is then fine-tuned in an end-to-end manner. This allows for stable training as well as robustness against ASR errors. The model is further extended to perform dialog act (DA) segmentation jointly. Evaluations on the Switchboard corpus demonstrate that the proposed method significantly improves dialog act recognition accuracy over the conventional pipeline framework.
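The coupling point described in the abstract can be illustrated with a toy forward pass. The sketch below is not the authors' implementation: the layer sizes, the random weights, the single tanh layer standing in for the acoustic-to-word ASR model, and the mean-pooling over word positions are all assumptions made for illustration, and no training or fine-tuning is shown. It only shows that the dialog act classifier reads the pre-softmax latent vectors rather than the decoded words.

```python
import math
import random

random.seed(0)

# Toy sizes, chosen only for illustration (assumptions, not from the paper).
FEAT_DIM, LATENT_DIM, VOCAB_SIZE, NUM_DA = 4, 3, 5, 2


def rand_matrix(rows, cols):
    """Random weights standing in for trained parameters."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


W_latent = rand_matrix(LATENT_DIM, FEAT_DIM)   # hypothetical stand-in for the ASR model
W_vocab = rand_matrix(VOCAB_SIZE, LATENT_DIM)  # ASR output projection (before softmax)
W_da = rand_matrix(NUM_DA, LATENT_DIM)         # dialog act classifier


def asr_latent(word_feats):
    """Latent vector for one word position, taken before the softmax layer."""
    return [math.tanh(v) for v in matvec(W_latent, word_feats)]


def word_posteriors(latent):
    """What a pipeline would argmax into a word; unused by the DA branch."""
    return softmax(matvec(W_vocab, latent))


def da_posteriors(latents):
    """Mean-pool word-level latents, then classify (pooling is an assumption)."""
    pooled = [sum(col) / len(latents) for col in zip(*latents)]
    return softmax(matvec(W_da, pooled))


# Six word positions of made-up acoustic features for one utterance.
utterance = [[random.uniform(-1.0, 1.0) for _ in range(FEAT_DIM)] for _ in range(6)]
latents = [asr_latent(feats) for feats in utterance]
da_probs = da_posteriors(latents)
predicted_da = max(range(NUM_DA), key=lambda i: da_probs[i])
```

Because `da_posteriors` consumes `latents` directly, a gradient from the dialog act loss could in principle flow back into the ASR weights, which is what makes end-to-end fine-tuning possible; a pipeline that passed only the argmax word from `word_posteriors` would cut that path.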


