A Survey of Large-Scale Deep Learning Serving System Optimization: Challenges and Opportunities

11/28/2021
by Fuxun Yu, et al.

Deep Learning (DL) models have achieved superior performance in many application domains, including vision, language, medicine, commercial advertising, and entertainment. With this rapid development, both DL applications and the underlying serving hardware have demonstrated strong scaling trends, namely Model Scaling and Compute Scaling: recent pre-trained models have hundreds of billions of parameters and TB-level memory consumption, while the newest GPU accelerators provide hundreds of TFLOPS. With both scaling trends, new problems and challenges emerge in DL inference serving systems, which are gradually trending towards Large-scale Deep learning Serving systems (LDS). This survey aims to summarize and categorize the emerging challenges and optimization opportunities for large-scale deep learning serving systems. By providing a novel taxonomy, summarizing the computing paradigms, and elaborating on recent technical advances, we hope that this survey can shed light on new optimization perspectives and motivate novel work in large-scale deep learning system optimization.


research
03/17/2022

A Survey of Multi-Tenant Deep Learning Inference on GPU

Deep Learning (DL) models have achieved superior performance. Meanwhile,...
research
06/27/2023

A Survey on Deep Learning Hardware Accelerators for Heterogeneous HPC Platforms

Recent trends in deep learning (DL) imposed hardware accelerators as the...
research
01/06/2023

Systems for Parallel and Distributed Large-Model Deep Learning Training

Deep learning (DL) has transformed applications in a variety of domains,...
research
06/09/2020

Hysia: Serving DNN-Based Video-to-Retail Applications in Cloud

Combining video streaming and online retailing (V2R) has been a growing ...
research
02/11/2022

Compute Trends Across Three Eras of Machine Learning

Compute, data, and algorithmic advances are the three fundamental factor...
research
08/16/2019

Survey on Deep Neural Networks in Speech and Vision Systems

This survey presents a review of state-of-the-art deep neural network ar...
research
10/28/2018

FFT, FMM, and Multigrid on the Road to Exascale: performance challenges and opportunities

FFT, FMM, and multigrid methods are widely used fast and highly scalable...
