VPU-EM: An Event-based Modeling Framework to Evaluate NPU Performance and Power Efficiency at Scale

03/17/2023
by Charles Qi, et al.

State-of-the-art NPUs are typically architected as self-contained subsystems with multiple heterogeneous hardware computing modules and a dataflow-driven programming model. The industry lacks well-established methodologies and tools for evaluating and comparing the performance of NPUs across different architectures. We present VPU-EM, an event-based performance modeling framework that targets scalable performance evaluation of modern NPUs across diverse AI workloads. The framework adopts a high-level event-based system-simulation methodology that abstracts away design details for speed while still modeling hardware pipelining, concurrency, and interaction with software task scheduling. It is developed natively in Python and built to interface directly with AI frameworks such as TensorFlow, PyTorch, ONNX, and OpenVINO, linking various in-house NPU graph compilers to achieve optimized full-model performance. VPU-EM also provides a Power-EM mode that models the power characteristics of the NPU, enabling joint performance/power analysis. Using VPU-EM, we conduct performance/power analysis of models from representative neural network architectures. We demonstrate that although the framework was developed for Intel VPU, an Intel in-house NPU IP technology, the methodology can be generalized to the analysis of modern NPUs.
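
To make the event-based idea concrete, here is a minimal sketch of a discrete-event simulation of a two-stage DMA-to-compute pipeline. It is not VPU-EM code and does not reflect its API: the function `simulate`, the event names, and the cycle counts are all hypothetical, chosen only to show how an event queue can capture pipelining and concurrency between hardware modules without modeling design details.

```python
import heapq


def simulate(num_tiles, dma_cycles, compute_cycles):
    """Hypothetical two-stage pipeline: DMA loads a tile, then compute consumes it.

    Returns (total_cycles, busy), where busy maps each module to its busy cycles.
    """
    busy = {"dma": 0, "compute": 0}
    if num_tiles <= 0:
        return 0, busy

    events = []  # min-heap of (time, sequence, kind, tile)
    seq = 0      # tie-breaker so simultaneous events pop in insertion order

    def push(time, kind, tile):
        nonlocal seq
        heapq.heappush(events, (time, seq, kind, tile))
        seq += 1

    ready = []            # tiles loaded and waiting for the compute module
    compute_busy = False
    now = 0

    # The DMA engine starts loading tile 0 at time 0.
    push(dma_cycles, "dma_done", 0)
    busy["dma"] += dma_cycles

    while events:
        now, _, kind, tile = heapq.heappop(events)
        if kind == "dma_done":
            ready.append(tile)
            # DMA immediately begins the next tile, overlapping with compute.
            if tile + 1 < num_tiles:
                push(now + dma_cycles, "dma_done", tile + 1)
                busy["dma"] += dma_cycles
        else:  # "compute_done"
            compute_busy = False

        # Dispatch the oldest ready tile whenever the compute module is idle.
        if ready and not compute_busy:
            next_tile = ready.pop(0)
            compute_busy = True
            push(now + compute_cycles, "compute_done", next_tile)
            busy["compute"] += compute_cycles

    return now, busy


total, busy = simulate(num_tiles=4, dma_cycles=10, compute_cycles=30)
print(total, busy)
```

The per-module busy counts are the kind of activity statistic that a power model could weight by per-module power figures, which is one plausible way a joint performance/power analysis might be assembled; the abstract does not say how Power-EM actually does this.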
