Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding

by Vishal Sunder, et al.

Dialog history plays an important role in spoken language understanding (SLU) performance in a dialog system. For end-to-end (E2E) SLU, previous work has used dialog history in text form, which makes the model dependent on a cascaded automatic speech recognizer (ASR). This rescinds the benefits of an E2E system, which is intended to be compact and robust to ASR errors. In this paper, we propose a hierarchical conversation model that is capable of directly using dialog history in speech form, making it fully E2E. We also distill semantic knowledge from the available gold conversation transcripts by jointly training a similar text-based conversation model with an explicit tying of acoustic and semantic embeddings. In addition, we propose a novel technique that we call DropFrame to deal with the long training time incurred by adding dialog history in an E2E manner. On the HarperValleyBank dialog dataset, our E2E history integration outperforms a history-independent baseline by 7.7% absolute F1 score on the task of dialog action recognition. Our model performs competitively with the state-of-the-art history-based cascaded baseline, but uses 48% fewer parameters. In the absence of gold transcripts to fine-tune an ASR model, our model outperforms this baseline by a significant margin of 10% absolute F1 score.


Integrating Dialog History into End-to-End Spoken Language Understanding Systems

End-to-end spoken language understanding (SLU) systems that process huma...

End-to-end speech-to-dialog-act recognition

Spoken language understanding, which extracts intents and/or semantic co...

Joint Modelling of Spoken Language Understanding Tasks with Integrated Dialog History

Most human interactions occur in the form of spoken conversations where ...

Neural Assistant: Joint Action Prediction, Response Generation, and Latent Knowledge Reasoning

Task-oriented dialog presents a difficult challenge encompassing multipl...

Dialogue history integration into end-to-end signal-to-concept spoken language understanding systems

This work investigates the embeddings for representing dialog history in...

Are Neural Open-Domain Dialog Systems Robust to Speech Recognition Errors in the Dialog History? An Empirical Study

Large end-to-end neural open-domain chatbots are becoming increasingly p...

Augmenting Non-Collaborative Dialog Systems with Explicit Semantic and Strategic Dialog History

We study non-collaborative dialogs, where two agents have a conflict of ...
