Tensor Casting: Co-Designing Algorithm-Architecture for Personalized Recommendation Training

10/25/2020
by Youngeun Kwon, et al.

Personalized recommendations are among the most widely deployed machine learning (ML) workloads serviced from cloud datacenters. As such, architectural solutions for high-performance recommendation inference have recently been the target of several prior studies. Unfortunately, little has been explored or understood regarding the training side of this emerging ML workload. In this paper, we first perform a detailed workload characterization study of recommendation training, root-causing sparse embedding layer training as one of the most significant performance bottlenecks. We then propose our algorithm-architecture co-design called Tensor Casting, which enables the development of a generic accelerator architecture for tensor gather-scatter that encompasses all the key primitives of training embedding layers. When prototyped on a real CPU-GPU system, Tensor Casting provides 1.9-21x improvements in training throughput compared to state-of-the-art approaches.
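
To make the gather-scatter terminology concrete, the following is a minimal sketch of the two generic primitives an embedding layer typically relies on: a gather of table rows in the forward pass and a scatter-add of gradients in the backward pass. It is not the Tensor Casting algorithm itself; the function names `gather` and `scatter_add`, the toy table, and the gradient values are all assumptions chosen for illustration.

```python
def gather(table, indices):
    """Forward pass: look up (gather) one embedding row per index."""
    return [list(table[i]) for i in indices]


def scatter_add(num_rows, dim, indices, grads):
    """Backward pass: accumulate each per-lookup gradient into its table row.

    Rows looked up more than once receive the sum of their gradients.
    """
    out = [[0.0] * dim for _ in range(num_rows)]
    for i, g in zip(indices, grads):
        for d in range(dim):
            out[i][d] += g[d]
    return out


# Hypothetical toy data: a 3-row, 2-dimensional embedding table.
table = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
indices = [2, 0, 2]  # row 2 is looked up twice

rows = gather(table, indices)

# Hypothetical upstream gradients, one per lookup.
grads = [[1.0, 1.0], [0.5, 0.5], [2.0, 2.0]]
table_grad = scatter_add(len(table), 2, indices, grads)
```

Because row 2 appears twice in `indices`, its gradient is the sum of the first and third entries of `grads`, while row 1, which was never looked up, keeps a zero gradient. This sparse, irregular access pattern is the kind of behavior the abstract identifies as a training bottleneck.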

research
05/12/2020

Centaur: A Chiplet-based, Hybrid Sparse-Dense Accelerator for Personalized Recommendations

Personalized recommendations are the backbone machine learning (ML) algo...
research
05/10/2022

Training Personalized Recommendation Systems from (GPU) Scratch: Look Forward not Backwards

Personalized recommendation models (RecSys) are one of the most popular ...
research
08/26/2022

DiVa: An Accelerator for Differentially Private Machine Learning

The widespread deployment of machine learning (ML) is raising serious co...
research
10/12/2020

MicroRec: Efficient Recommendation Inference by Hardware and Data Structure Solutions

Deep neural networks are widely used in personalized recommendation syst...
research
06/01/2022

Good Intentions: Adaptive Parameter Servers via Intent Signaling

Parameter servers (PSs) ease the implementation of distributed training ...
research
10/10/2020

Cross-Stack Workload Characterization of Deep Recommendation Systems

Deep learning based recommendation systems form the backbone of most per...
research
07/11/2019

The acute:chronic workload ratio: challenges and prospects for improvement

Injuries occur when an athlete performs a greater amount of activity (wo...
