DeepAI AI Chat
Log In Sign Up

Prediction of Performance and Power Consumption of GPGPU Applications

by   Gargi Alavani, et al.

Graphics Processing Units (GPUs) have become an integral part of High-Performance Computing to achieve an Exascale performance. The main goal of application developers of GPU is to tune their code extensively to obtain optimal performance, making efficient use of different resources available. While extracting optimal performance of applications on an HPC infrastructure, developers should also ensure the applications have the least energy usage considering the massive power consumption of data centres and HPC servers. This thesis presents two models developed which can be utilized by developers in analysing the CUDA kernel's energy dissipation. The first one is a model that predicts the CUDA kernel's execution time. Here a PTX code is statically analysed to extract instruction features, control flow, and data dependence. We propose two scheduling algorithm approaches that satisfy the performance and hardware constraints. The second model is a static analysis-based power prediction built by utilizing machine learning techniques. Features used for building the model are derived using static analysis of PTX code. These features are chosen to understand the relationship between GPU power consumption and program features that can aid developers in building energy-efficient, sustainable applications. The dataset used for validating both models include kernels from different benchmarks suits, sizes, nature (e.g., compute-bound, memory-bound), and complexity (e.g., control divergence, memory access patterns). We also present a tool that has practically validated the effectiveness and ease of using the two models as design assistance tools for GPU.


page 21

page 37


A Simple Model for Portable and Fast Prediction of Execution Time and Power Consumption of GPU Kernels

Characterizing compute kernel execution behavior on GPUs for efficient t...

Design of an energy aware petaflops class high performance cluster based on power architecture

In this paper we present D.A.V.I.D.E. (Development for an Added Value In...

8 Steps to 3.7 TFLOP/s on NVIDIA V100 GPU: Roofline Analysis and Other Tricks

Performance optimization can be a daunting task especially as the hardwa...

Power Consumption Analysis of Parallel Algorithms on GPUs

Due to their highly parallel multi-cores architecture, GPUs are being in...

A Tool for Automatically Suggesting Source-Code Optimizations for Complex GPU Kernels

Future computing systems, from handhelds to supercomputers, will undoubt...

Understanding the Impact of Input Entropy on FPU, CPU, and GPU Power

Power is increasingly becoming a limiting resource in high-performance, ...