An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

03/04/2018
by Shaojie Bai, et al.

For most deep learning practitioners, sequence modeling is synonymous with recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine translation. Given a new sequence modeling task or dataset, which architecture should one use? We conduct a systematic evaluation of generic convolutional and recurrent architectures for sequence modeling. The models are evaluated across a broad range of standard tasks that are commonly used to benchmark recurrent networks. Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory. We conclude that the common association between sequence modeling and recurrent networks should be reconsidered, and convolutional networks should be regarded as a natural starting point for sequence modeling tasks.

