Differentiable Window for Dynamic Local Attention

06/24/2020
by Thanh-Tung Nguyen et al.

We propose Differentiable Window, a new neural module and general-purpose component for dynamic window selection. While the module is broadly applicable, we demonstrate a compelling use case: improving standard attention modules by enabling more focused attention over input regions. We propose two variants of Differentiable Window and integrate them into the Transformer architecture in two novel ways. We evaluate our approach on a range of NLP tasks, including machine translation, sentiment analysis, subject-verb agreement, and language modeling. Our experimental results demonstrate consistent and sizable improvements across all tasks.
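As a rough, self-contained sketch of the core idea (not the authors' exact formulation), a differentiable window can be built by predicting soft distributions over the window's start and end positions and converting them into a soft rectangular mask via cumulative sums; the mask then reweights standard attention. The function names, shapes, and the renormalization step below are illustrative assumptions.

```python
# Minimal PyTorch sketch of a differentiable soft window over attention.
# soft_window_mask and windowed_attention are hypothetical names, not the
# paper's API; this only illustrates the general mechanism.
import torch
import torch.nn.functional as F

def soft_window_mask(start_logits, end_logits):
    """start_logits, end_logits: (batch, seq_len) boundary scores.

    Returns a (batch, seq_len) soft mask that is ~1 inside the predicted
    window and ~0 outside, differentiable w.r.t. both logit tensors.
    """
    p_start = F.softmax(start_logits, dim=-1)   # where the window opens
    p_end = F.softmax(end_logits, dim=-1)       # where the window closes
    left = torch.cumsum(p_start, dim=-1)        # ~1 at and after the start
    # Reverse cumulative sum: ~1 at and before the end position.
    right = torch.flip(torch.cumsum(torch.flip(p_end, dims=[-1]), dim=-1),
                       dims=[-1])
    return left * right                          # ~1 only inside the window

def windowed_attention(q, k, v, start_logits, end_logits):
    """Scaled dot-product attention reweighted by the soft window mask."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5  # (batch, q_len, k_len)
    weights = F.softmax(scores, dim=-1)
    mask = soft_window_mask(start_logits, end_logits)  # (batch, k_len)
    weights = weights * mask.unsqueeze(1)        # broadcast mask over queries
    # Renormalize so attention weights still sum to 1 per query.
    weights = weights / weights.sum(dim=-1, keepdim=True).clamp_min(1e-9)
    return weights @ v

# Usage example with random tensors.
batch, q_len, k_len, d = 2, 5, 7, 16
q = torch.randn(batch, q_len, d)
k = torch.randn(batch, k_len, d)
v = torch.randn(batch, k_len, d)
start_logits = torch.randn(batch, k_len, requires_grad=True)
end_logits = torch.randn(batch, k_len, requires_grad=True)
out = windowed_attention(q, k, v, start_logits, end_logits)  # (2, 5, 16)
out.sum().backward()  # gradients reach the boundary predictors
```

Because the mask is built only from softmax outputs and cumulative sums, gradients flow back to the boundary predictors, so the window's position and width can be learned end to end rather than fixed in advance.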

Related research

08/23/2022 · A differentiable short-time Fourier transform with respect to the window length
In this paper, we revisit the use of spectrograms in neural networks, by...

06/20/2020 · Memory Transformer
Transformer-based models have achieved state-of-the-art results in many ...

11/28/2021 · FastTrees: Parallel Latent Tree-Induction for Faster Sequence Encoding
Inducing latent tree structures from sequential data is an emerging tren...

03/24/2022 · Beyond Fixation: Dynamic Window Visual Transformer
Recently, a surge of interest in visual transformers is to reduce the co...

10/10/2016 · A Dynamic Window Neural Network for CCG Supertagging
Combinatory Categorial Grammar (CCG) supertagging is a task to assign lexi...

06/01/2023 · Masked Autoencoders with Multi-Window Attention Are Better Audio Learners
Several recent works have adapted Masked Autoencoders (MAEs) for learnin...

06/09/2016 · MuFuRU: The Multi-Function Recurrent Unit
Recurrent neural networks such as the GRU and LSTM found wide adoption i...
