Simplified Hierarchical Recurrent Encoder-Decoder for Building End-To-End Dialogue Systems

09/08/2018
by Chao Wang, et al.

As a generative model for building end-to-end dialogue systems, the Hierarchical Recurrent Encoder-Decoder (HRED) consists of three layers of Gated Recurrent Units (GRUs), which serve, from bottom to top, as the word-level encoder, the sentence-level encoder, and the decoder. Although HRED performs well on dialogue corpora, its complexity makes it computationally expensive to train. To improve training efficiency, we propose a new model, named Simplified HRED (SHRED), in which each layer except the top one is made simpler than the layer above it. On the one hand, we propose the Scalar Gated Unit (SGU), a simplified variant of the GRU, and use it as the sentence-level encoder. On the other hand, we use Fixed-size Ordinally-Forgetting Encoding (FOFE), which has no trainable parameters at all, as the word-level encoder. Experimental results show that, compared with HRED under the same word embedding size and the same hidden state size for each layer, SHRED reduces the number of trainable parameters by 25%--35% and the training time by more than 50%, while still achieving slightly better performance.
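To make the two simplified components concrete, here is a minimal NumPy sketch. The FOFE recurrence z_t = alpha * z_{t-1} + x_t is standard and parameter-free; the ScalarGatedUnit below is an assumed simplification of the GRU in which the update gate is a single scalar per step, not necessarily the paper's exact SGU equations.

```python
import numpy as np

def fofe_encode(word_vectors, alpha=0.7):
    """Fixed-size Ordinally-Forgetting Encoding (FOFE).

    Applies z_t = alpha * z_{t-1} + x_t over the word vectors of a
    sentence. The forgetting factor alpha is a fixed hyperparameter,
    so this word-level encoder has no trainable parameters.
    """
    z = np.zeros(word_vectors.shape[1])
    for x in word_vectors:
        z = alpha * z + x
    return z

class ScalarGatedUnit:
    """GRU-like cell with a scalar update gate (assumed form).

    A full GRU computes vector-valued update and reset gates; this
    sketch collapses the update gate to one scalar per step and drops
    the reset gate, which is one plausible way to simplify the GRU as
    the paper's SGU does.
    """
    def __init__(self, input_size, hidden_size, rng=None):
        rng = rng or np.random.default_rng(0)
        s = 1.0 / np.sqrt(hidden_size)
        self.W = rng.uniform(-s, s, (hidden_size, input_size))
        self.U = rng.uniform(-s, s, (hidden_size, hidden_size))
        self.u = rng.uniform(-s, s, input_size)   # scalar-gate input weights
        self.w = rng.uniform(-s, s, hidden_size)  # scalar-gate state weights

    def step(self, x, h):
        # Scalar update gate in (0, 1) interpolating old and new state.
        z = 1.0 / (1.0 + np.exp(-(self.u @ x + self.w @ h)))
        h_tilde = np.tanh(self.W @ x + self.U @ h)  # candidate state
        return (1.0 - z) * h + z * h_tilde
```

In the SHRED layout described above, each sentence would first be compressed by `fofe_encode`, and the resulting fixed-size sentence vectors would then be fed step by step into the scalar-gated sentence-level encoder.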

