Solving Attention Kernel Regression Problem via Pre-conditioner

08/28/2023
by Zhao Song, et al.

Large language models have shown impressive performance on many tasks. From a computational perspective, one of their central operations is computing the attention matrix. Previous works [Zandieh, Han, Daliri, and Karbasi 2023; Alman and Song 2023] have formally studied the possibility and impossibility of approximating the attention matrix. In this work, we define and study a new problem, the attention kernel regression problem. We show how to solve attention kernel regression in input sparsity time of the data matrix.
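The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of solving a regression problem via a pre-conditioner. It applies the standard sketch-and-precondition technique to an ordinary least-squares problem min_x ||Ax - b||_2, which is an assumption made for illustration and not the paper's attention kernel regression formulation. The function name `sketch_precondition_lstsq` and the parameters `sketch_rows`, `iters`, and `step` are hypothetical. A dense Gaussian sketch is used for simplicity; it does not run in input sparsity time, which would require a sparse sketching matrix.

```python
import numpy as np

def sketch_precondition_lstsq(A, b, sketch_rows=None, iters=100, step=0.8, seed=0):
    """Approximately solve min_x ||A x - b||_2 with a sketched QR preconditioner.

    Illustrative only: a generic sketch-and-precondition routine,
    not the algorithm from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    m = sketch_rows or 20 * d  # assumed oversampling factor

    # Dense Gaussian sketch S (m x n), scaled so that S approximately
    # preserves norms on the column space of A.
    S = rng.standard_normal((m, n)) / np.sqrt(m)

    # QR of the small sketched matrix; R (d x d) serves as the preconditioner,
    # so that A @ inv(R) is well conditioned.
    _, R = np.linalg.qr(S @ A)

    x = np.zeros(d)
    for _ in range(iters):
        g = A.T @ (A @ x - b)  # gradient of 0.5 * ||A x - b||^2
        # Preconditioned gradient step: x <- x - step * (R^T R)^{-1} g
        x = x - step * np.linalg.solve(R, np.linalg.solve(R.T, g))
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Columns with very different scales make A ill conditioned.
    A = rng.standard_normal((500, 10)) * np.logspace(0, 3, 10)
    b = rng.standard_normal(500)
    x = sketch_precondition_lstsq(A, b)
    x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
    print(np.max(np.abs(x - x_ref)))
```

The design choice here is that the expensive factorization is done only on the small sketched matrix S A, while the iterations touch A solely through matrix-vector products; the damped step size is a conservative assumption so that the iteration still contracts when the sketch is only moderately accurate.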
