Improving Autoregressive NLP Tasks via Modular Linearized Attention

04/17/2023
by   Victor Agostinelli, et al.

Various natural language processing (NLP) tasks require models that are small and efficient because they are ultimately deployed at the edge or in other resource-constrained environments. While prior research has reduced the size of these models, increasing computational efficiency without considerable performance impacts remains difficult, especially for autoregressive tasks. This paper proposes modular linearized attention (MLA), which combines multiple efficient attention mechanisms, including cosFormer, to maximize inference quality while achieving notable speedups. We validate this approach on several autoregressive NLP tasks, including speech-to-text neural machine translation (S2T NMT), speech-to-text simultaneous translation (SimulST), and autoregressive text-to-spectrogram synthesis (TTS), noting efficiency gains on TTS and competitive performance for NMT and SimulST during training and inference.
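The core idea behind linearized attention, which MLA builds on, is to replace the softmax kernel with a feature map φ so that attention can be computed as φ(Q)(φ(K)ᵀV) in O(n·d·dv) time rather than O(n²·d). Below is a minimal, illustrative sketch of this kernel-based family (the function name, the choice of φ, and the list-based representation are assumptions for illustration; cosFormer additionally applies a cosine-based positional re-weighting, which is omitted here):

```python
def linearized_attention(Q, K, V, phi=lambda x: max(x, 0.0) + 1e-6):
    """Kernel-based linear attention: out_i = phi(q_i) (K'^T V) / z_i.

    Q, K are lists of n rows of dimension d; V is a list of n rows of
    dimension dv. phi is a positive feature map standing in for softmax.
    Cost is O(n * d * dv) instead of softmax attention's O(n^2 * d).
    """
    fmap = lambda M: [[phi(x) for x in row] for row in M]
    Qp, Kp = fmap(Q), fmap(K)
    d, dv = len(Qp[0]), len(V[0])
    # Accumulate K'^T V (d x dv) and the key sum (d,) in a single pass;
    # this is the step that removes the quadratic dependence on n.
    KV = [[0.0] * dv for _ in range(d)]
    ksum = [0.0] * d
    for krow, vrow in zip(Kp, V):
        for i in range(d):
            ksum[i] += krow[i]
            for j in range(dv):
                KV[i][j] += krow[i] * vrow[j]
    out = []
    for qrow in Qp:
        z = sum(q * s for q, s in zip(qrow, ksum))  # per-row normalizer
        out.append([sum(q * KV[i][j] for i, q in enumerate(qrow)) / z
                    for j in range(dv)])
    return out
```

Because the per-row output is a convex combination of the value rows, feeding an all-ones V returns all ones, which is a quick sanity check that the normalization is correct. For autoregressive decoding, the running `KV` and `ksum` accumulators can be updated incrementally per step, which is where the inference speedups discussed in the paper come from.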


