NTP: A Neural Network Topology Profiler

05/22/2019
by Raghavendra Bhat, et al.

Performance of an end-to-end neural network on a given hardware platform is a function of its compute and memory signature, which in turn is governed by a wide range of parameters such as topology size, primitives used, framework used, batching strategy, latency requirements, and precision. Current benchmarking tools suffer from limitations: a) they are too granular, like DeepBench; b) they mandate a working implementation that is framework-specific or hardware-architecture-specific; or c) they provide only high-level benchmark metrics. In this paper, we present NTP (Neural Net Topology Profiler), a sophisticated benchmarking framework that effectively identifies the memory and compute signature of an end-to-end topology on multiple hardware architectures, without the need to actually implement the topology in a framework. NTP is tightly integrated with hardware-specific benchmark tools to enable exhaustive data collection and analysis. Using NTP, a deep learning researcher can quickly establish the baselines needed to understand the performance of an end-to-end neural network topology and make high-level architectural decisions based on optimization techniques like layer sizing, quantization, and pruning. Further, integration of NTP with frameworks like TensorFlow, PyTorch, and Intel OpenVINO allows for performance comparison along several vectors: a) comparison of different frameworks on a given hardware platform; b) comparison of different hardware platforms using a given framework; and c) comparison across different heterogeneous hardware configurations for a given framework. These capabilities empower a researcher to effortlessly make the architectural decisions needed to achieve optimized performance on any hardware platform. The paper documents the architectural approach of NTP and demonstrates the capabilities of the tool by benchmarking Mozilla DeepSpeech, a popular speech recognition topology.
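To make the idea of a compute/memory signature concrete, the following is a minimal sketch (not NTP's actual API; all names here are hypothetical) of how per-layer multiply-accumulate counts and memory footprints can be derived from a topology description alone, without implementing the network in any framework. This is the kind of analysis the abstract describes, shown here for fully connected layers only:

```python
# Hypothetical illustration, not NTP's API: derive a compute/memory
# signature for a topology from its layer dimensions alone.
from dataclasses import dataclass

@dataclass
class DenseLayer:
    in_features: int
    out_features: int

def layer_signature(layer, batch=1, bytes_per_elem=4):
    """Return (MACs, parameter bytes, output activation bytes) for one dense layer."""
    macs = batch * layer.in_features * layer.out_features
    # Weights plus biases, at the chosen numeric precision.
    param_bytes = (layer.in_features * layer.out_features
                   + layer.out_features) * bytes_per_elem
    act_bytes = batch * layer.out_features * bytes_per_elem
    return macs, param_bytes, act_bytes

def topology_signature(layers, batch=1, bytes_per_elem=4):
    """Sum the per-layer signatures over an entire topology."""
    totals = [0, 0, 0]
    for layer in layers:
        for i, v in enumerate(layer_signature(layer, batch, bytes_per_elem)):
            totals[i] += v
    return tuple(totals)

# Example: a toy 3-layer MLP, batch size 8, compared at fp32 vs. int8.
mlp = [DenseLayer(512, 1024), DenseLayer(1024, 1024), DenseLayer(1024, 256)]
fp32 = topology_signature(mlp, batch=8, bytes_per_elem=4)
int8 = topology_signature(mlp, batch=8, bytes_per_elem=1)
```

Comparing `fp32` and `int8` signatures shows why quantization is one of the optimization levers the abstract mentions: the MAC count is unchanged, but the parameter and activation footprints shrink fourfold, which shifts where a topology sits relative to a platform's compute and memory-bandwidth limits.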
