DeepRecSys: A System for Optimizing End-To-End At-scale Neural Recommendation Inference

01/08/2020
by   Udit Gupta, et al.
0

Neural personalized recommendation is the corner-stone of a wide collection of cloud services and products, constituting significant compute demand of the cloud infrastructure. Thus, improving the execution efficiency of neural recommendation directly translates into infrastructure capacity saving. In this paper, we devise a novel end-to-end modeling infrastructure, DeepRecInfra, that adopts an algorithm and system co-design methodology to custom-design systems for recommendation use cases. Leveraging the insights from the recommendation characterization, a new dynamic scheduler, DeepRecSched, is proposed to maximize latency-bounded throughput by taking into account characteristics of inference query size and arrival patterns, recommendation model architectures, and underlying hardware systems. By doing so, system throughput is doubled across the eight industry-representative recommendation models. Finally, design, deployment, and evaluation in at-scale production datacenter shows over 30 hundreds of machines.

READ FULL TEXT

page 1

page 4

page 8

research
03/14/2022

Hercules: Heterogeneity-Aware Inference Serving for At-Scale Personalized Recommendation

Personalized recommendation is an important class of deep-learning appli...
research
10/10/2020

Cross-Stack Workload Characterization of Deep Recommendation Systems

Deep learning based recommendation systems form the backbone of most per...
research
06/06/2019

The Architectural Implications of Facebook's DNN-based Personalized Recommendation

The widespread application of deep learning has changed the landscape of...
research
01/29/2021

RecSSD: Near Data Processing for Solid State Drive Based Recommendation Inference

Neural personalized recommendation models are used across a wide variety...
research
11/09/2022

RecD: Deduplication for End-to-End Deep Learning Recommendation Model Training Infrastructure

We present RecD (Recommendation Deduplication), a suite of end-to-end in...
research
05/18/2021

RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance

Deep learning recommendation systems must provide high quality, personal...
research
06/03/2021

JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu

In modern internet industries, deep learning based recommender systems h...

Please sign up or login with your details

Forgot password? Click here to reset