Constant Memory Attention Block

06/21/2023
by Leo Feng, et al.

Modern foundation model architectures rely on attention mechanisms to effectively capture context. However, these methods require memory that grows linearly or quadratically with the number of inputs/datapoints, limiting their applicability in low-compute domains. In this work, we propose the Constant Memory Attention Block (CMAB), a novel general-purpose attention block that computes its output in constant memory and performs updates in constant computation. Highlighting CMAB's efficacy, we introduce methods for Neural Processes and Temporal Point Processes. Empirically, we show our proposed methods achieve results competitive with the state-of-the-art while being significantly more memory efficient.
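The sketch below is not the authors' reference implementation; it is a minimal illustration of the general idea the abstract describes: cross-attention from a fixed number of learned latent vectors onto a stream of inputs, where each new datapoint only updates running sums, so the persistent state stays O(num_latents) no matter how many inputs arrive. The class name `ConstantMemoryCrossAttention` and all hyperparameters are illustrative assumptions, and numerical-stability tricks (e.g., running-max subtraction in the softmax) are omitted for brevity.

```python
import torch
import torch.nn as nn


class ConstantMemoryCrossAttention(nn.Module):
    """Toy constant-memory cross-attention: a fixed latent summary of a stream."""

    def __init__(self, dim: int, num_latents: int = 8):
        super().__init__()
        # Learned latent queries: their count is fixed up front,
        # independent of how many inputs/datapoints are processed.
        self.latents = nn.Parameter(torch.randn(num_latents, dim))
        self.key_proj = nn.Linear(dim, dim)
        self.value_proj = nn.Linear(dim, dim)
        self.scale = dim ** -0.5
        # Running (unnormalized) attention state: a weighted value sum and a
        # normalizer per latent. This is the only state that persists.
        self.register_buffer("weighted_sum", torch.zeros(num_latents, dim))
        self.register_buffer("normalizer", torch.zeros(num_latents, 1))

    @torch.no_grad()
    def update(self, x_new: torch.Tensor) -> None:
        """Absorb a new batch of datapoints (shape [n, dim]) with constant extra memory."""
        k = self.key_proj(x_new)                        # [n, dim]
        v = self.value_proj(x_new)                      # [n, dim]
        logits = self.latents @ k.t() * self.scale      # [num_latents, n]
        w = logits.exp()                                # unnormalized attention weights
        self.weighted_sum += w @ v                      # accumulate weighted values
        self.normalizer += w.sum(dim=-1, keepdim=True)  # accumulate attention mass

    def read(self) -> torch.Tensor:
        """Return the current latent summary, shape [num_latents, dim]."""
        return self.weighted_sum / self.normalizer.clamp_min(1e-8)


if __name__ == "__main__":
    block = ConstantMemoryCrossAttention(dim=16, num_latents=4)
    stream = torch.randn(100, 16)
    for chunk in stream.split(10):   # feed the stream in chunks
        block.update(chunk)          # state size never grows with the stream
    print(block.read().shape)        # torch.Size([4, 16])
```

Because the per-latent running sums fully determine the attention output, adding a new datapoint is a constant-cost update rather than a recomputation over all past inputs, which is the property the abstract refers to as constant-memory output and constant-computation updates.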
