NPS: A Framework for Accurate Program Sampling Using Graph Neural Network

04/18/2023
by Yuanwei Fang, et al.

With the end of Moore's Law, there is a growing demand for rapid architectural innovations in modern processors, such as RISC-V custom extensions, to continue performance scaling. Program sampling is a crucial step in microprocessor design: it selects representative simulation points for workload simulation. While SimPoint has been the de facto approach for decades, the limited expressiveness of its Basic Block Vector (BBV) requires time-consuming human tuning, often taking months, which impedes fast innovation and agile hardware development. This paper introduces Neural Program Sampling (NPS), a novel framework that learns execution embeddings using dynamic snapshots of a Graph Neural Network. NPS deploys AssemblyNet for embedding generation, leveraging an application's code structures and runtime states. AssemblyNet serves as NPS's graph model and neural architecture, capturing a program's behavior in aspects such as data computation, code path, and data flow. AssemblyNet is trained with a data-prefetch task that predicts consecutive memory addresses. In the experiments, NPS outperforms SimPoint by up to 63%, reducing the average error by 38%; the increased accuracy cuts the expensive accuracy-tuning overhead. Furthermore, NPS shows higher accuracy and generality than the state-of-the-art GNN approach to code behavior learning, enabling the generation of high-quality execution embeddings.
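For context, SimPoint-style sampling works roughly as follows: the instruction stream is cut into fixed-length intervals, each interval is summarized by a Basic Block Vector, the vectors are clustered with k-means, and the interval closest to each centroid is kept as a simulation point. The sketch below illustrates that flow; the interval length, vector width, cluster count, and trace format are illustrative assumptions, not values from the paper.

import numpy as np
from sklearn.cluster import KMeans

def build_bbvs(bb_trace, interval_len=10_000_000, num_blocks=4096):
    """bb_trace: iterable of (basic_block_id, instruction_count) tuples."""
    bbvs, current, executed = [], np.zeros(num_blocks), 0
    for bb_id, icount in bb_trace:
        current[bb_id % num_blocks] += icount
        executed += icount
        if executed >= interval_len:
            bbvs.append(current / current.sum())   # normalize each interval's BBV
            current, executed = np.zeros(num_blocks), 0
    if executed:
        bbvs.append(current / current.sum())
    return np.stack(bbvs)

def pick_simpoints(bbvs, k=30):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(bbvs)
    simpoints = []
    for c in range(k):
        members = np.flatnonzero(km.labels_ == c)
        if members.size == 0:
            continue
        # keep the interval nearest the centroid, weighted by the cluster's share
        dists = np.linalg.norm(bbvs[members] - km.cluster_centers_[c], axis=1)
        simpoints.append((int(members[dists.argmin()]), members.size / len(bbvs)))
    return simpoints   # list of (interval_index, weight)

AssemblyNet itself is not detailed in this abstract, so the following is only a minimal, hypothetical sketch of the idea it describes: a small message-passing network over a code/data-flow graph snapshot produces a pooled "execution embedding" and is trained on a prefetch-style auxiliary task that predicts the next memory-address delta. Layer sizes, the delta vocabulary, and the graph encoding are invented for illustration and are not the paper's architecture.

import torch
import torch.nn as nn

class TinyCodeGNN(nn.Module):
    def __init__(self, num_node_types=256, hidden=64, num_deltas=1024, layers=3):
        super().__init__()
        self.embed = nn.Embedding(num_node_types, hidden)      # instruction/operand nodes
        self.msg = nn.ModuleList(nn.Linear(hidden, hidden) for _ in range(layers))
        self.upd = nn.ModuleList(nn.GRUCell(hidden, hidden) for _ in range(layers))
        self.head = nn.Linear(hidden, num_deltas)               # address-delta classes

    def forward(self, node_types, adj):
        # node_types: (N,) int tensor; adj: (N, N) 0/1 adjacency over data/control edges
        h = self.embed(node_types)
        for msg, upd in zip(self.msg, self.upd):
            m = adj @ msg(h)        # aggregate transformed neighbor messages
            h = upd(m, h)           # GRU-style node update
        g = h.mean(dim=0)           # pooled graph embedding = "execution embedding"
        return g, self.head(g)      # embedding + next-address-delta logits

# Hypothetical training step on the prefetch-style task with stand-in data.
model = TinyCodeGNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
node_types = torch.randint(0, 256, (32,))          # stand-in for one graph snapshot
adj = (torch.rand(32, 32) < 0.1).float()
target_delta = torch.randint(0, 1024, (1,))        # next memory-address delta bucket
embedding, logits = model(node_types, adj)
loss = nn.functional.cross_entropy(logits.unsqueeze(0), target_delta)
opt.zero_grad(); loss.backward(); opt.step()

In a pipeline like the one NPS describes, pooled embeddings from many such snapshots would replace BBVs as the per-interval features fed to the clustering step sketched above.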


