DCT: Dynamic Compressive Transformer for Modeling Unbounded Sequence

10/10/2021
by Kai-Po Chang, et al.

In this paper, we propose the Dynamic Compressive Transformer (DCT), a transformer-based framework for modeling unbounded sequences. In contrast to previous baselines, which append every sentence representation to memory, conditionally selecting which representations to append is a more reasonable way to handle unboundedly long sequences. Our model learns a policy that decides, during training, whether each sequence should be kept in memory in a compressed state or discarded. By retaining only semantically meaningful sentence information in the memory system, DCT outperforms the previous state-of-the-art (SOTA) model on the Enwik8 benchmark.
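To make the conditional keep-or-discard idea concrete, here is a minimal sketch of such a memory controller. This is not the authors' implementation: the class name DynamicMemoryController, the MLP policy head, the linear compression, and the Bernoulli keep/discard sampling are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class DynamicMemoryController(nn.Module):
    """Illustrative keep-or-discard memory policy (hypothetical sketch).

    Assumptions (not from the paper): sentence representations are d-dimensional
    vectors, the policy is a small MLP producing a Bernoulli keep probability,
    and compression is a single linear projection.
    """

    def __init__(self, d_model: int, d_compressed: int):
        super().__init__()
        # Scores an incoming sentence representation with a keep probability.
        self.policy = nn.Sequential(
            nn.Linear(d_model, d_model // 2),
            nn.ReLU(),
            nn.Linear(d_model // 2, 1),
        )
        # Maps a kept representation to a smaller compressed state.
        self.compress = nn.Linear(d_model, d_compressed)
        self.memory = []  # list of compressed sentence states

    def step(self, sent_repr: torch.Tensor) -> None:
        # Sample a keep/discard decision from the policy's probability.
        keep_prob = torch.sigmoid(self.policy(sent_repr))
        if torch.bernoulli(keep_prob).item() == 1:
            # Keep: store only a compressed state so memory grows slowly.
            self.memory.append(self.compress(sent_repr).detach())
        # Discard: the representation is simply not appended.

    def memory_tensor(self) -> torch.Tensor:
        # Stack compressed states for use as extra attention context.
        if not self.memory:
            return torch.empty(0)
        return torch.stack(self.memory)
```

In use, each new sentence representation would pass through step(), and memory_tensor() would supply the compressed states as additional attention context; how the keep/discard decision itself is trained (for example, with a policy-gradient signal) is left unspecified in this sketch.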


Related research

10/14/2020 · Memformer: The Memory-Augmented Transformer
11/13/2019 · Compressive Transformers for Long-Range Sequence Modelling
01/13/2020 · Reformer: The Efficient Transformer
03/13/2023 · Transformer-based World Models Are Happy With 100k Interactions
07/12/2020 · Sparse Graph to Sequence Learning for Vision Conditioned Long Textual Sequence Generation
10/06/2021 · ABC: Attention with Bounded-memory Control
10/11/2022 · Planning Assembly Sequence with Graph Transformer
