Efficient Use of Limited-Memory Accelerators for Linear Learning on Heterogeneous Systems

08/17/2017
by Celestine Dünner, et al.

We propose a generic algorithmic building block to accelerate the training of machine learning models on heterogeneous compute systems. Our scheme makes it possible to efficiently employ compute accelerators such as GPUs and FPGAs for training large-scale machine learning models when the training data exceeds their memory capacity. It also adapts to any system's memory hierarchy, in terms of both size and processing speed. Our technique is built upon novel theoretical insights regarding primal-dual coordinate methods, and uses duality-gap information to dynamically decide which part of the data should be made available for fast processing. To illustrate the power of our approach, we demonstrate its performance for training generalized linear models on a large-scale dataset that exceeds the memory size of a modern GPU, showing an order-of-magnitude speedup over existing approaches.
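To make the gap-based selection idea concrete, below is a minimal NumPy sketch, not the authors' implementation: it scores every coordinate by its duality gap, keeps only the highest-gap coordinates in a size-limited working set (standing in for accelerator memory), and runs dual coordinate ascent there. The squared loss, the closed-form update, and all names (per_coordinate_gaps, train_gap_selected, budget) are illustrative assumptions; the paper treats generalized linear models generically.

import numpy as np

def per_coordinate_gaps(X, y, alpha, w):
    # Per-coordinate duality gaps for the squared loss phi_i(z) = 0.5*(z - y_i)^2:
    #   gap_i = phi_i(w.x_i) + phi_i*(-alpha_i) + alpha_i * (w.x_i),
    # which is zero exactly at the optimal alpha_i.
    z = X @ w
    return 0.5 * (z - y) ** 2 + 0.5 * alpha ** 2 - alpha * y + alpha * z

def train_gap_selected(X, y, lam=1.0, budget=256, outer_iters=20,
                       inner_iters=1000, rng=np.random.default_rng(0)):
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)  # maintained as w = X^T alpha / (lam * n)
    for _ in range(outer_iters):
        # Host side: score all coordinates by duality gap and select the
        # top `budget` of them as the working set for fast processing.
        gaps = per_coordinate_gaps(X, y, alpha, w)
        active = np.argsort(gaps)[-budget:]
        # Accelerator side (simulated): dual coordinate ascent restricted
        # to the working set, with the closed-form squared-loss update.
        for i in rng.choice(active, size=inner_iters):
            x_i = X[i]
            delta = (y[i] - x_i @ w - alpha[i]) / (1.0 + x_i @ x_i / (lam * n))
            alpha[i] += delta
            w += delta * x_i / (lam * n)
    return w

In the scheme the paper describes, the full dataset stays in large, slow host memory while only the budget-sized working set occupies the accelerator; in this sketch both live in a single NumPy array purely for brevity.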


Related research

03/20/2020  Deep Learning Training in Facebook Data Centers: Design of Scale-up and Scale-out Systems
Large-scale training is important to ensure high performance and accurac...

11/11/2020  Understanding Training Efficiency of Deep Learning Recommendation Models at Scale
The use of GPUs has proliferated for machine learning workflows and is n...

01/20/2021  Marius: Learning Massive Graph Embeddings on a Single Machine
We propose a new framework for computing the embeddings of large-scale g...

03/16/2018  Snap Machine Learning
We describe an efficient, scalable machine learning library that enables...

10/13/2021  Scalable Graph Embedding Learning On A Single GPU
Graph embedding techniques have attracted growing interest since they co...

08/08/2020  GPU-Accelerated Primal Learning for Extremely Fast Large-Scale Classification
One of the most efficient methods to solve L2-regularized primal problem...

06/22/2021  High Performance Optimization at the Door of the Exascale
quest for processing speed potential. In fact, we always get a fraction ...
