XSP: Across-Stack Profiling and Analysis of Machine Learning Models on GPUs

08/19/2019
by Cheng Li, et al.

There has been a rapid proliferation of machine learning/deep learning (ML) models and wide adoption of them in many application domains. This has made profiling and characterization of ML model performance an increasingly pressing task for both hardware designers and system providers, as they would like to offer the best possible system to serve ML models with the target latency, throughput, cost, and energy requirements while maximizing resource utilization. Such an endeavor is challenging as the characteristics of an ML model depend on the interplay between the model, framework, system libraries, and the hardware (or the HW/SW stack). Existing profiling tools are disjoint, however, and only focus on profiling within a particular level of the stack, which limits the thoroughness and usefulness of the profiling results. This paper proposes XSP — an across-stack profiling design that gives a holistic and hierarchical view of ML model execution. XSP leverages distributed tracing to aggregate and correlate profile data from different sources. XSP introduces a leveled and iterative measurement approach that accurately captures the latencies at all levels of the HW/SW stack in spite of the profiling overhead. We couple the profiling design with an automated analysis pipeline to systematically analyze 65 state-of-the-art ML models. We demonstrate that XSP provides insights which would be difficult to discern otherwise.
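The abstract's two mechanisms, correlating profile data from different sources and measuring level by level, can be illustrated with a minimal sketch. It is not XSP's implementation: the `Span` type, the level numbering (1 = model, 2 = layer, 3 = GPU kernel), the interval-containment rule for assigning parents, and the rule of trusting each level's latency only from the run in which it was the deepest level profiled are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Span:
    """A timed interval reported by one profiler (hypothetical type)."""
    name: str
    level: int    # assumed numbering: 1 = model, 2 = layer, 3 = GPU kernel
    start: float
    end: float
    parent: Optional["Span"] = field(default=None, repr=False)
    children: List["Span"] = field(default_factory=list, repr=False)

    @property
    def duration(self) -> float:
        return self.end - self.start


def correlate(spans: List[Span]) -> List[Span]:
    """Attach each span to the tightest enclosing span one level above it.

    Assumption: spans from different profilers share a clock, so a child
    can be matched to its parent purely by interval containment.
    Returns the spans that have no parent (the roots).
    """
    for span in spans:
        candidates = [
            p for p in spans
            if p.level == span.level - 1
            and p.start <= span.start
            and span.end <= p.end
        ]
        if candidates:
            parent = min(candidates, key=lambda p: p.duration)
            span.parent = parent
            parent.children.append(span)
    return [s for s in spans if s.parent is None]


def leveled_durations(runs: Dict[int, List[Span]]) -> Dict[str, float]:
    """Pick each span's latency from the run where its level was the deepest.

    `runs` maps the deepest enabled level to the spans captured in that run.
    Assumption: deeper profilers add overhead to the levels above them, so a
    level's timing is trusted only when nothing below it was being profiled.
    """
    trusted: Dict[str, float] = {}
    for deepest, spans in runs.items():
        for s in spans:
            if s.level == deepest:
                trusted[s.name] = s.duration
    return trusted
```

In this sketch, a model-only run supplies the end-to-end latency, a second run with layer profiling enabled supplies the layer latencies, and so on; `correlate` then rebuilds the hierarchy so that a kernel can be traced back to the layer and model that launched it.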

research
08/19/2019

Across-Stack Profiling and Characterization of Machine Learning Models on GPUs

The world sees a proliferation of machine learning/deep learning (ML) mo...
research
01/05/2022

CFU Playground: Full-Stack Open-Source Framework for Tiny Machine Learning (tinyML) Acceleration on FPGAs

We present CFU Playground, a full-stack open-source framework that enabl...
research
04/25/2023

What Causes Exceptions in Machine Learning Applications? Mining Machine Learning-Related Stack Traces on Stack Overflow

Machine learning (ML), including deep learning, has recently gained trem...
research
02/19/2020

MLModelScope: A Distributed Platform for Model Evaluation and Benchmarking at Scale

Machine Learning (ML) and Deep Learning (DL) innovations are being intro...
research
11/07/2022

DeepFlow: A Cross-Stack Pathfinding Framework for Distributed AI Systems

Over the past decade, machine learning model complexity has grown at an ...
research
09/17/2021

Cross-layer Visualization and Profiling of Network and I/O Communication for HPC Clusters

Understanding and visualizing the full-stack performance trade-offs and ...
research
02/16/2022

BB-ML: Basic Block Performance Prediction using Machine Learning Techniques

Recent years have seen the adoption of Machine Learning (ML) techniques ...
