ChunkFormer: Learning Long Time Series with Multi-stage Chunked Transformer

12/30/2021
by Yue Ju, et al.

The analysis of long sequence data remains challenging in many real-world applications. We propose a novel architecture, ChunkFormer, that improves the existing Transformer framework to handle the challenges posed by long time series. Transformer-based models use an attention mechanism to discover global information along a sequence and leverage contextual data. In long sequences, however, local information such as seasonality and short-term fluctuations is confined to short sub-sequences and is easily overlooked. In addition, the original Transformer consumes considerable resources by carrying the entire attention matrix throughout training. To overcome these challenges, ChunkFormer splits long sequences into smaller chunks for the attention calculation, progressively applying different chunk sizes at each stage. In this way, the proposed model gradually learns both local and global information without changing the total length of the input sequences. We have extensively tested this new architecture on data from different business domains and demonstrated its advantage over existing Transformer-based models.
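The chunked-attention idea described in the abstract can be sketched as follows. This is a minimal illustration assuming a PyTorch-style implementation; the class names, chunk-size schedule, and residual connections are assumptions made for exposition, not the authors' released code.

# Sketch of multi-stage chunked self-attention (assumed PyTorch design;
# names, chunk sizes, and stage layout are illustrative only).
import torch
import torch.nn as nn


class ChunkedSelfAttention(nn.Module):
    """Self-attention restricted to fixed-size chunks of the input sequence."""

    def __init__(self, d_model: int, n_heads: int, chunk_size: int):
        super().__init__()
        self.chunk_size = chunk_size
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); seq_len assumed divisible by chunk_size.
        b, t, d = x.shape
        c = self.chunk_size
        # Fold each chunk into the batch dimension so attention stays local
        # and the attention matrix is only (chunk_size x chunk_size).
        x = x.reshape(b * (t // c), c, d)
        out, _ = self.attn(x, x, x)
        # Restore the original layout; the total sequence length is unchanged.
        return out.reshape(b, t, d)


class ChunkFormerSketch(nn.Module):
    """Stack of stages with progressively larger chunk sizes (assumed schedule)."""

    def __init__(self, d_model: int = 64, n_heads: int = 4,
                 chunk_sizes=(16, 64, 256)):
        super().__init__()
        self.stages = nn.ModuleList(
            ChunkedSelfAttention(d_model, n_heads, c) for c in chunk_sizes
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Early stages capture local patterns (seasonality, fluctuations);
        # later stages use larger chunks and recover more global context.
        for stage in self.stages:
            x = x + stage(x)  # residual connection, a common design choice
        return x


# Usage example: a batch of 2 series, 256 time steps, 64 features.
if __name__ == "__main__":
    model = ChunkFormerSketch()
    series = torch.randn(2, 256, 64)
    print(model(series).shape)  # torch.Size([2, 256, 64])

The key design point this sketch tries to convey is that each stage only ever materializes attention within a chunk, so memory grows with the chunk size rather than the full sequence length, while the staged schedule of increasing chunk sizes lets later layers integrate progressively more global context.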
