BagPipe: Accelerating Deep Recommendation Model Training

02/24/2022
by Saurabh Agarwal, et al.

Deep learning based recommendation models (DLRM) are widely used in several business-critical applications. Training such recommendation models efficiently is challenging, primarily because they consist of billions of embedding-based parameters that are often stored remotely, leading to significant overheads from embedding access. By profiling existing DLRM training, we observe that only 8.5% of the iteration time is spent in the forward/backward pass, while the remaining time is spent on embedding and model synchronization. Our key insight in this paper is that accesses to embeddings have a specific structure and pattern which can be used to accelerate training. We observe that embedding accesses are heavily skewed, with almost 1% of embeddings representing more than 92% of total accesses. Further, we observe that during training we can look ahead at future batches to determine exactly which embeddings will be needed at what iteration in the future. Based on these insights, we propose Bagpipe, a system for training deep recommendation models that uses caching and prefetching to overlap remote embedding accesses with the computation. We designed an Oracle Cacher, a new system component which uses our lookahead algorithm to generate optimal cache update decisions and provide strong consistency guarantees. Our experiments using three datasets and two models show that our approach provides a speedup of up to 6.2x compared to state-of-the-art baselines, while providing the same convergence and reproducibility guarantees as synchronous training.
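The lookahead idea in the abstract can be illustrated with a small sketch. This is not the paper's Oracle Cacher or its algorithm; it is a simplified illustration under stated assumptions: each batch is just a collection of embedding IDs, the function name `plan_cache_updates` and the `lookahead` parameter are hypothetical, and consistency, write-back, and communication are ignored. For each iteration it reports which IDs to prefetch (they appear in the visible window and are not yet cached) and which to evict (they are not used again within the window).

```python
def plan_cache_updates(batches, lookahead):
    """Return a list of (prefetch, evict) ID sets, one pair per iteration.

    batches:   list of iterables of embedding IDs, one per iteration.
    lookahead: number of future iterations visible to the planner
               (hypothetical parameter, for illustration only).
    """
    batches = [set(b) for b in batches]
    cached = set()
    plan = []
    for i in range(len(batches)):
        # IDs needed in the current iteration or the visible future window.
        upcoming = set().union(*batches[i : i + lookahead + 1])
        prefetch = upcoming - cached
        cached |= prefetch
        # IDs with no use in iterations i+1 .. i+lookahead can leave the
        # cache once iteration i has finished.
        later = set().union(*batches[i + 1 : i + lookahead + 1])
        evict = cached - later
        cached -= evict
        plan.append((prefetch, evict))
    return plan


if __name__ == "__main__":
    demo = [[1, 2], [2, 3], [1, 4], [5]]
    for step, (fetch, drop) in enumerate(plan_cache_updates(demo, lookahead=1)):
        print(step, sorted(fetch), sorted(drop))
```

With a window of one iteration, ID 1 is evicted after iteration 0 and fetched again for iteration 2, which shows why a longer lookahead window reduces redundant remote accesses in this toy model.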

research
12/14/2021

HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework

Embedding models have been an effective learning paradigm for high-dimen...
research
08/08/2022

A Frequency-aware Software Cache for Large Recommendation System Embeddings

Deep learning recommendation models (DLRMs) have been widely applied in ...
research
03/01/2021

High-Performance Training by Exploiting Hot-Embeddings in Recommendation Systems

Recommendation models are commonly used learning models that suggest rel...
research
07/16/2021

Look Ahead ORAM: Obfuscating Addresses in Recommendation Model Training

In the cloud computing era, data privacy is a critical concern. Memory a...
research
08/26/2019

Graph Embedding Based Hybrid Social Recommendation System

Item recommendation tasks are a widely studied topic. Recent development...
research
02/24/2021

Semantically Constrained Memory Allocation (SCMA) for Embedding in Efficient Recommendation Systems

Deep learning-based models are utilized to achieve state-of-the-art perf...
research
01/17/2013

Affinity Weighted Embedding

Supervised (linear) embedding models like Wsabie and PSI have proven suc...
