Efficient Long Sequence Encoding via Synchronization

03/15/2022
by Xiangyang Mou, et al.

Pre-trained Transformer models have achieved success on a wide range of NLP tasks, but they are inefficient when handling long input sequences. Existing studies try to overcome this limitation by segmenting the long sequence and then applying hierarchical encoding or post-hoc aggregation. We propose a synchronization mechanism for hierarchical encoding. Our approach first identifies anchor tokens across segments and groups them by their roles in the original input sequence. Then, inside each Transformer layer, the anchor embeddings within a group are synchronized via a self-attention module. Our approach is a general framework with sufficient flexibility: when adapted to a new task, it can easily be enhanced with task-specific anchor definitions. Experiments on two representative tasks with different types of long input text, the NarrativeQA summary setting and wild multi-hop reasoning from HotpotQA, demonstrate that our approach improves global information exchange among segments while maintaining efficiency.
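The abstract describes the core mechanism: anchor tokens that share a role are gathered across segments and exchanged through a dedicated self-attention module inside each Transformer layer. The snippet below is a minimal, hypothetical PyTorch sketch of that idea; the class name `AnchorSync`, the `(segment, token)` grouping format, and the use of `nn.MultiheadAttention` are assumptions made for illustration, not the authors' implementation.

```python
# Sketch of cross-segment anchor synchronization (illustrative, not the paper's code):
# anchor embeddings that share a role are gathered from all segments, exchanged via
# self-attention, and written back into their segments.
import torch
import torch.nn as nn


class AnchorSync(nn.Module):
    def __init__(self, d_model: int, num_heads: int = 8):
        super().__init__()
        # Self-attention used only to synchronize anchors within a group.
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

    def forward(self, hidden: torch.Tensor, groups: list[list[tuple[int, int]]]) -> torch.Tensor:
        """
        hidden: [num_segments, seq_len, d_model] segment-wise hidden states.
        groups: each group lists (segment_idx, token_idx) positions of anchor tokens
                that play the same role in the original long input.
        """
        hidden = hidden.clone()
        for group in groups:
            if len(group) < 2:
                continue  # nothing to synchronize across segments
            seg_idx = torch.tensor([s for s, _ in group])
            tok_idx = torch.tensor([t for _, t in group])
            anchors = hidden[seg_idx, tok_idx].unsqueeze(0)   # [1, k, d_model]
            synced, _ = self.attn(anchors, anchors, anchors)  # exchange within the group
            hidden[seg_idx, tok_idx] = synced.squeeze(0)
        return hidden


# Toy usage: 3 segments of length 128, one anchor group spanning all three segments.
sync = AnchorSync(d_model=64)
h = torch.randn(3, 128, 64)
h = sync(h, groups=[[(0, 5), (1, 17), (2, 42)]])
```

Because only the (typically few) anchor positions attend to each other, this kind of synchronization keeps the cross-segment cost small compared with full attention over the concatenated long input.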
