Giga-scale Kernel Matrix Vector Multiplication on GPU

02/02/2022
by Robert Hu, et al.
Kernel matrix-vector multiplication (KMVM) is a foundational operation in machine learning and scientific computing. However, because KMVM tends to scale quadratically in both memory and time, applications are often limited by these computational constraints. In this paper, we propose a novel approximation procedure, coined the Faster-Fast and Free Memory Method, to address these scaling issues of KMVM for tall (10^8 ∼ 10^9) and skinny (D ≤ 7) data. Extensive experiments demonstrate that the method has empirical linear time and memory complexity with a relative error of order 10^-3 and can compute a full KMVM for a billion points in under a minute on a high-end GPU, a significant speed-up over existing CPU methods. We demonstrate the utility of our procedure by applying it as a drop-in for the state-of-the-art GPU-based linear solver FALKON, improving speed by 1.5 to 5.5 times at the cost of a <1% drop in accuracy. We further demonstrate competitive results on Gaussian Process regression, coupled with significant speedups, on a variety of real-world datasets.
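
To make the quadratic cost concrete, the following is a minimal sketch of an exact KMVM baseline, not the approximation procedure proposed in the paper. It assumes a Gaussian kernel with a hypothetical `lengthscale` parameter; the names `gaussian_kernel`, `exact_kmvm`, and `make_points` are illustrative and do not come from the paper. Kernel entries are computed on the fly, so memory stays linear, but the number of kernel evaluations still grows as N × M.

```python
import math
import random


def gaussian_kernel(x, y, lengthscale=1.0):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2)).

    The kernel choice and the lengthscale are assumptions for illustration.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))


def exact_kmvm(X, Y, b, kernel=gaussian_kernel):
    """Exact KMVM: v[i] = sum_j k(X[i], Y[j]) * b[j].

    The kernel matrix is never stored, so memory is O(N + M),
    but the nested loops cost O(N * M) kernel evaluations.
    """
    return [sum(kernel(x, y) * bj for y, bj in zip(Y, b)) for x in X]


def make_points(n, d, rng):
    """Draw n points uniformly from the unit cube in d dimensions."""
    return [[rng.random() for _ in range(d)] for _ in range(n)]


rng = random.Random(0)
X = make_points(200, 3, rng)  # many points, few dimensions (tiny toy scale)
b = [rng.gauss(0.0, 1.0) for _ in range(len(X))]
v = exact_kmvm(X, X, b)       # 200 * 200 = 40,000 kernel evaluations
```

Doubling the number of points in this sketch quadruples the work, which is the scaling behaviour that motivates approximate KMVM at the 10^8 to 10^9 scale discussed in the abstract.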

research
12/31/2021

Fast ultrametric matrix-vector multiplication

We study the properties of ultrametric matrices aiming to design methods...
research
09/17/2023

Analog Content-Addressable Memory from Complementary FeFETs

To address the increasing computational demands of artificial intelligen...
research
02/03/2016

Inv-ASKIT: A Parallel Fast Direct Solver for Kernel Matrices

We present a parallel algorithm for computing the approximate factorizat...
research
06/08/2021

The Fast Kernel Transform

Kernel methods are a highly effective and widely used collection of mode...
research
04/13/2022

Explicit caching HYB: a new high-performance SpMV framework on GPGPU

Sparse Matrix-Vector Multiplication (SpMV) is a critical operation for t...
research
02/28/2017

Improving the Neural GPU Architecture for Algorithm Learning

Algorithm learning is a core problem in artificial intelligence with sig...
research
05/09/2020

In-memory eigenvector computation in time O(1)

In-memory computing with crosspoint resistive memory arrays has gained e...
