Diagonal State Spaces are as Effective as Structured State Spaces

03/27/2022
by Ankit Gupta, et al.

Modeling long-range dependencies in sequential data is a fundamental step towards attaining human-level performance in many modalities such as text, vision and audio. While attention-based models are a popular and effective choice for modeling short-range interactions, their performance on tasks requiring long-range reasoning has been largely inadequate. In a breakthrough result, Gu et al. (2022) proposed the Structured State Space (S4) architecture, delivering large gains over state-of-the-art models on several long-range tasks across various modalities. The core proposition of S4 is the parameterization of state matrices via a diagonal plus low-rank structure, allowing efficient computation. In this work, we show that one can match the performance of S4 even without the low-rank correction, that is, with the state matrices assumed to be diagonal. Our Diagonal State Space (DSS) model matches the performance of S4 on Long Range Arena tasks and on speech classification on the Speech Commands dataset, while being conceptually simpler and straightforward to implement.
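To illustrate why a diagonal state matrix is simple to work with, here is a minimal NumPy sketch of a generic discrete-time linear state space with a diagonal state matrix. It is not the DSS parameterization from the paper: the recurrence form, the function names, and the real-valued toy parameters (`lam`, `b`, `c`) are all assumptions made for illustration. The sketch shows that, when the state matrix is diagonal, the convolution kernel reduces to a weighted sum of powers of the diagonal entries, and that convolving the input with this kernel reproduces the step-by-step recurrence.

```python
import numpy as np

def diagonal_ssm_kernel(lam, w, length):
    """Kernel K[k] = sum_i w[i] * lam[i]**k for k = 0..length-1.

    lam: diagonal entries of the state matrix, shape (N,)
    w:   elementwise product of output and input weights, shape (N,)
    """
    exponents = np.arange(length)[:, None]        # shape (length, 1)
    powers = lam[None, :] ** exponents            # shape (length, N)
    return powers @ w                             # shape (length,)

def run_recurrence(lam, b, c, u):
    """Assumed recurrence: x_k = lam * x_{k-1} + b * u_k, y_k = sum(c * x_k)."""
    x = np.zeros_like(lam)
    ys = []
    for u_k in u:
        x = lam * x + b * u_k                     # diagonal update is elementwise
        ys.append(np.sum(c * x))
    return np.array(ys)

def run_convolution(lam, b, c, u):
    """Same output as run_recurrence, computed as a causal convolution."""
    kernel = diagonal_ssm_kernel(lam, c * b, len(u))
    return np.convolve(u, kernel)[: len(u)]       # keep the causal part only

# Toy parameters (hypothetical; real-valued and stable for simplicity).
rng = np.random.default_rng(0)
N, L = 4, 16
lam = rng.uniform(0.1, 0.9, size=N)
b = rng.normal(size=N)
c = rng.normal(size=N)
u = rng.normal(size=L)

y_rec = run_recurrence(lam, b, c, u)
y_conv = run_convolution(lam, b, c, u)
print(np.allclose(y_rec, y_conv))
```

Because the state update is elementwise, no matrix powers or low-rank corrections are needed to form the kernel. For long sequences the direct convolution would typically be replaced by an FFT-based one; that step is omitted here to keep the sketch short.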


research
09/26/2022

Liquid Structural State-Space Models

A proper parametrization of state transition matrices of linear state-sp...
research
10/31/2021

Efficiently Modeling Long Sequences with Structured State Spaces

A central goal of sequence modeling is designing a single principled mod...
research
06/23/2022

On the Parameterization and Initialization of Diagonal State Space Models

State space models (SSM) have recently been shown to be very effective a...
research
12/01/2022

Simplifying and Understanding State Space Models with Diagonal Linear RNNs

Sequence models based on linear state spaces (SSMs) have recently emerge...
research
06/27/2022

Long Range Language Modeling via Gated State Spaces

State space models have shown to be effective at modeling long range dep...
research
10/17/2022

What Makes Convolutional Models Great on Long Sequence Modeling?

Convolutional models have been widely used in multiple domains. However,...
research
03/07/2023

Structured State Space Models for In-Context Reinforcement Learning

Structured state space sequence (S4) models have recently achieved state...
