RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing

12/30/2019
by   Liu Ke, et al.
0

Personalized recommendation systems leverage deep learning models and account for the majority of data center AI cycles. Their performance is dominated by memory-bound sparse embedding operations with unique irregular memory access patterns that pose a fundamental challenge to accelerate. This paper proposes a lightweight, commodity DRAM compliant, near-memory processing solution to accelerate personalized recommendation inference. The in-depth characterization of production-grade recommendation models shows that embedding operations with high model-, operator- and data-level parallelism lead to memory bandwidth saturation, limiting recommendation inference performance. We propose RecNMP which provides a scalable solution to improve system throughput, supporting a broad range of sparse embedding models. RecNMP is specifically tailored to production environments with heavy co-location of operators on a single server. Several hardware/software co-optimization techniques such as memory-side caching, table-aware packet scheduling, and hot entry profiling are studied, resulting in up to 9.8x memory latency speedup over a highly-optimized baseline. Overall, RecNMP offers 4.2x throughput improvement and 45.8 energy savings.

READ FULL TEXT

page 6

page 8

research
10/12/2020

MicroRec: Efficient Recommendation Inference by Hardware and Data Structure Solutions

Deep neural networks are widely used in personalized recommendation syst...
research
01/29/2021

RecSSD: Near Data Processing for Solid State Drive Based Recommendation Inference

Neural personalized recommendation models are used across a wide variety...
research
06/06/2019

The Architectural Implications of Facebook's DNN-based Personalized Recommendation

The widespread application of deep learning has changed the landscape of...
research
12/02/2022

DisaggRec: Architecting Disaggregated Systems for Large-Scale Personalized Recommendation

Deep learning-based personalized recommendation systems are widely used ...
research
04/12/2021

Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models

Deep learning recommendation models (DLRMs) are used across many busines...
research
10/21/2021

Supporting Massive DLRM Inference Through Software Defined Memory

Deep Learning Recommendation Models (DLRM) are widespread, account for a...
research
01/25/2022

RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation

We propose RecShard, a fine-grained embedding table (EMB) partitioning a...

Please sign up or login with your details

Forgot password? Click here to reset