Toward Accurate Platform-Aware Performance Modeling for Deep Neural Networks

12/01/2020
by Chuan-Chi Wang, et al.

In this paper, we present PerfNetV2, a fine-grained machine-learning-based method that improves on the accuracy of our previous work in modeling neural network performance on a variety of GPU accelerators. Given an application, the proposed method predicts the inference time and training time of the convolutional neural networks used in that application, which lets the system developer optimize performance by choosing among neural networks and/or incorporating hardware accelerators to deliver satisfactory results in time. Furthermore, the proposed method can predict the performance of an unseen or non-existent device, e.g., a new GPU that has a higher operating frequency and fewer processor cores but more memory capacity. This allows a system developer to quickly search the hardware design space and/or fine-tune the system configuration. Compared to previous work, PerfNetV2 delivers more accurate results by modeling the detailed host-accelerator interactions involved in executing full neural networks and by improving the architecture of the machine learning model used in the predictor. Our case studies show that PerfNetV2 yields a mean absolute percentage error within 13.1%, whereas the error rate of a previous work published in ICBD 2018 could be as large as 200%.
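The accuracy figures above are stated as mean absolute percentage error (MAPE). As a minimal sketch of how that metric is commonly computed, the snippet below compares measured and predicted execution times. The function name `mape` and the sample timings are hypothetical and chosen only for illustration; they are not taken from the paper or from PerfNetV2.

```python
def mape(measured, predicted):
    """Return the mean absolute percentage error, in percent.

    Each error is normalized by the measured (ground-truth) value,
    so every measured value must be non-zero.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - m) / abs(m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


# Hypothetical execution times in milliseconds (illustrative, not from the paper).
measured_ms = [10.0, 20.0, 40.0]
predicted_ms = [11.0, 18.0, 44.0]

error = mape(measured_ms, predicted_ms)
print(f"MAPE = {error:.1f}%")
```

Because each term is divided by the measured time, a short-running network with a small absolute error can still contribute a large percentage error, which is worth keeping in mind when comparing predictors across networks of very different sizes.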
