Temporal Convolutional Attention-based Network For Sequence Modeling

02/28/2020
by Hongyan Hao, et al.

With the development of feed-forward models, the default model for sequence modeling has gradually shifted away from recurrent networks. Many powerful feed-forward models based on convolutional networks and attention mechanisms have been proposed and show greater potential for sequence modeling tasks. We ask whether there is an architecture that can both serve as an approximate substitute for recurrent networks and absorb the advantages of feed-forward models. To this end, we propose an exploratory architecture, the Temporal Convolutional Attention-based Network (TCAN), which combines a temporal convolutional network with an attention mechanism. TCAN has two parts: Temporal Attention (TA), which captures relevant features inside the sequence, and Enhanced Residual (ER), which extracts important information from shallow layers and transfers it to deep layers. We improve the state-of-the-art bpc/perplexity results to 26.92 on word-level PTB, 1.043 on character-level PTB, and 6.66 on WikiText-2.
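
The abstract does not spell out how Temporal Attention is computed, so the following is only a minimal sketch of one plausible reading: scaled dot-product self-attention with a causal mask, so that each time step attends only to itself and earlier steps. The function name `causal_self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and the use of NumPy are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Generic causal self-attention (illustrative, not the authors' TA layer).

    x: array of shape (T, d), one feature vector per time step.
    w_q, w_k, w_v: hypothetical projection matrices of shape (d, d_k), (d, d_k), (d, d_v).
    Returns (output of shape (T, d_v), attention weights of shape (T, T)).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])

    # Mask strictly-future positions so step t only sees steps <= t.
    T = x.shape[0]
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    # Numerically stable row-wise softmax; masked entries become exactly 0.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))                      # 5 time steps, 8 features
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
out, attn = causal_self_attention(x, w_q, w_k, w_v)
```

In this sketch the attention matrix `attn` is lower triangular, which is what lets an attention layer be stacked with causal convolutions without leaking future tokens into the prediction for the current step.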

research
10/28/2021

Understanding How Encoder-Decoder Architectures Attend

Encoder-decoder networks with attention have proven to be a powerful way...
research
02/01/2019

Exploring attention mechanism for acoustic-based classification of speech utterances into system-directed and non-system-directed

Voice controlled virtual assistants (VAs) are now available in smartphon...
research
02/08/2022

Modeling Structure with Undirected Neural Networks

Neural networks are powerful function estimators, leading to their statu...
research
12/05/2018

Summarizing Videos with Attention

In this work we propose a novel method for supervised, keyshots based vi...
research
10/13/2020

Unfolding recurrence by Green's functions for optimized reservoir computing

Cortical networks are strongly recurrent, and neurons have intrinsic tem...
research
12/27/2021

Augmenting Convolutional networks with attention-based aggregation

We show how to augment any convolutional network with an attention-based...
research
02/01/2023

Predicting CSI Sequences With Attention-Based Neural Networks

In this work, we consider the problem of multi-step channel prediction i...
