A Deep Memory-based Architecture for Sequence-to-Sequence Learning

06/22/2015
by Fandong Meng et al.

We propose DEEPMEMORY, a novel deep architecture for sequence-to-sequence learning, which performs the task through a series of nonlinear transformations from the representation of the input sequence (e.g., a Chinese sentence) to the final output sequence (e.g., its English translation). Inspired by the recently proposed Neural Turing Machine (Graves et al., 2014), we store the intermediate representations in stacked layers of memories, and use read-write operations on the memories to realize the nonlinear transformations between the representations. The types of transformations are designed in advance, but their parameters are learned from data. Through layer-by-layer transformations, DEEPMEMORY can model the complicated relations between sequences that are necessary for applications such as machine translation between distant languages. The architecture can be trained with standard back-propagation on sequence-to-sequence data, and training scales easily to large corpora. DEEPMEMORY is broad enough to subsume the state-of-the-art neural translation model of Bahdanau et al. (2015) as a special case, while significantly improving upon that model with its deeper architecture. Remarkably, DEEPMEMORY, being purely neural network-based, achieves performance comparable to the traditional phrase-based machine translation system Moses with a small vocabulary and a modest parameter size.
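
To make the layer-by-layer picture concrete, here is a minimal, illustrative sketch of one stacked-memory transformation: each layer holds a memory of vectors (one per source position), reads from it with content-based, attention-style addressing, and writes a nonlinear transformation of the read result into the next layer's memory. This is not the authors' implementation; the names (MemoryLayer, W_read, W_write) are invented for the sketch, and plain NumPy with random weights stands in for the paper's trained neural components.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class MemoryLayer:
    """One illustrative layer of a stacked-memory transformation.

    Hypothetical simplification of the read-write idea: the layer reads
    from an (L, dim) memory with content-based addressing and writes the
    transformed result into the next layer's memory, slot by slot.
    """
    def __init__(self, dim, rng):
        # Random stand-ins for parameters that would be learned from data.
        self.W_read = rng.standard_normal((dim, dim)) * 0.1
        self.W_write = rng.standard_normal((dim, dim)) * 0.1

    def transform(self, memory):
        # memory: (L, dim), e.g. representations of L source positions.
        new_memory = np.zeros_like(memory)
        for i, query in enumerate(memory):
            # Content-based read: attention weights over all memory slots.
            scores = memory @ (self.W_read @ query)       # (L,)
            read = softmax(scores) @ memory               # (dim,)
            # Write: a nonlinear transformation of the read vector
            # becomes slot i of the next layer's memory.
            new_memory[i] = np.tanh(self.W_write @ read)
        return new_memory

rng = np.random.default_rng(0)
layers = [MemoryLayer(dim=8, rng=rng) for _ in range(3)]
memory = rng.standard_normal((5, 8))   # stand-in for encoded source words
for layer in layers:                   # layer-by-layer transformations
    memory = layer.transform(memory)
print(memory.shape)                    # (5, 8): final memory for decoding
```

In the actual model, such weights would be learned end-to-end by back-propagation on sequence-to-sequence data; the sketch only shows the read-write data flow between stacked memories.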

Related research

08/16/2016 - An Efficient Character-Level Neural Machine Translation
  Neural machine translation aims at building a single large neural networ...

05/07/2021 - Duplex Sequence-to-Sequence Learning for Reversible Machine Translation
  Sequence-to-sequence (seq2seq) problems such as machine translation are ...

10/29/2016 - Sequence-to-sequence neural network models for transliteration
  Transliteration is a key component of machine translation systems and so...

12/08/2017 - Sequence to Sequence Networks for Roman-Urdu to Urdu Transliteration
  Neural Machine Translation models have replaced the conventional phrase ...

05/14/2019 - Sparse Sequence-to-Sequence Models
  Sequence-to-sequence models are a powerful workhorse of NLP. Most varian...

05/29/2022 - The impact of memory on learning sequence-to-sequence tasks
  The recent success of neural networks in machine translation and other f...

06/11/2022 - Can the Language of the Collation be Translated into the Language of the Stemma? Using Machine Translation for Witness Localization
  Stemmatology is a subfield of philology where one approach to understand...
