Time-Based Roofline for Deep Learning Performance Analysis

09/09/2020
by Charlene Yang, et al.

Deep learning applications are usually very compute-intensive and require long run times for training and inference. Researchers have tackled this problem on both the hardware and software sides. In this paper, we propose a Roofline-based approach to performance analysis to facilitate the optimization of these applications. The approach extends the Roofline model widely used in traditional high-performance computing applications, and it incorporates both compute/bandwidth complexity and run time in its formulae to provide insights into deep learning-specific characteristics. We use two sets of representative kernels, 2D convolution and long short-term memory, to validate and demonstrate the new approach, and we investigate how arithmetic intensity, cache locality, auto-tuning, kernel launch overhead, and Tensor Core usage affect performance. Compared to the common ad-hoc approach, this study helps form a more systematic way to analyze code performance and identify optimization opportunities for deep learning applications.
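To make the relationship between complexity and run time concrete, the following is a minimal sketch of the classic Roofline bound together with one time-based reading of it. It is not taken from the paper and does not reproduce its formulae. The `Machine` class, the function names, and the peak figures are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    # Hypothetical peak figures, not measurements of any real device.
    peak_flops: float      # peak compute throughput, FLOP/s
    peak_bandwidth: float  # peak memory bandwidth, bytes/s


def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """FLOPs performed per byte of data moved."""
    return flops / bytes_moved


def attainable_flops(machine: Machine, intensity: float) -> float:
    """Classic Roofline: throughput is capped by compute or by bandwidth."""
    return min(machine.peak_flops, intensity * machine.peak_bandwidth)


def min_run_time(machine: Machine, flops: float, bytes_moved: float) -> float:
    """Time-based view: the slower of compute time and data-movement time
    is a lower bound on kernel run time (assumes perfect overlap)."""
    compute_time = flops / machine.peak_flops
    memory_time = bytes_moved / machine.peak_bandwidth
    return max(compute_time, memory_time)


# Example kernel: 2 GFLOP of work moving 1 GB of data on an assumed machine.
machine = Machine(peak_flops=1e12, peak_bandwidth=1e11)
kernel_flops, kernel_bytes = 2e9, 1e9
intensity = arithmetic_intensity(kernel_flops, kernel_bytes)
bound_flops = attainable_flops(machine, intensity)
bound_time = min_run_time(machine, kernel_flops, kernel_bytes)
```

In this example the memory time exceeds the compute time, so the kernel is bandwidth-bound under the assumed peaks. Comparing a measured run time against `bound_time` indicates how far a kernel sits from its roofline.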


research
09/11/2020

Hierarchical Roofline Performance Analysis for Deep Learning Applications

This paper presents a practical methodology for collecting performance d...
research
06/20/2022

Performance Prediction in Major League Baseball by Long Short-Term Memory Networks

Player performance prediction is a serious problem in every sport since ...
research
06/30/2017

Applying the Polyhedral Model to Tile Time Loops in Devito

The run time of many scientific computation applications for numerical m...
research
04/16/2019

swTVM: Exploring the Automated Compilation for Deep Learning on Sunway Architecture

The flourish of deep learning frameworks and hardware platforms has been...
research
06/18/2021

Distributed Deep Learning in Open Collaborations

Modern deep learning applications require increasingly more compute to t...
research
08/15/2021

Sonic: A Sampling-based Online Controller for Streaming Applications

Many applications in important problem domains such as machine learning ...
research
11/09/2017

Performance Evaluation of Deep Learning Tools in Docker Containers

With the success of deep learning techniques in a broad range of applica...
