A reduced-precision streaming SpMV architecture for Personalized PageRank on FPGA

by   Alberto Parravicini, et al.

Sparse matrix-vector multiplication is often employed in many data-analytic workloads in which low latency and high throughput are more valuable than exact numerical convergence. FPGAs provide quick execution times while offering precise control over the accuracy of the results thanks to reduced-precision fixed-point arithmetic. In this work, we propose a novel streaming implementation of Coordinate Format (COO) sparse matrix-vector multiplication, and study its effectiveness when applied to the Personalized PageRank algorithm, a common building block of recommender systems in e-commerce websites and social networks. Our implementation achieves speedups up to 6x over a reference floating-point FPGA architecture and a state-of-the-art multi-threaded CPU implementation on 8 different data-sets, while preserving the numerical fidelity of the results and reaching up to 42x higher energy efficiency compared to the CPU implementation.



There are no comments yet.


page 1

page 2

page 3

page 4


Design and implementation of an out-of-order execution engine of floating-point arithmetic operations

In this thesis, work is undertaken towards the design in hardware descri...

MELOPPR: Software/Hardware Co-design for Memory-efficient Low-latency Personalized PageRank

Personalized PageRank (PPR) is a graph algorithm that evaluates the impo...

Experience with PCIe streaming on FPGA for high throughput ML inferencing

Achieving maximum possible rate of inferencing with minimum hardware res...

Efficient Floating-Point Givens Rotation Unit

High-throughput QR decomposition is a key operation in many advanced sig...

Sparse Tucker Tensor Decomposition on a Hybrid FPGA-CPU Platform

Recommendation systems, social network analysis, medical imaging, and da...

Near-Precise Parameter Approximation for Multiple Multiplications on A Single DSP Block

A multiply-accumulate (MAC) operation is the main computation unit for D...

Centaur: A Chiplet-based, Hybrid Sparse-Dense Accelerator for Personalized Recommendations

Personalized recommendations are the backbone machine learning (ML) algo...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.