GEVO: GPU Code Optimization using Evolutionary Computation

04/17/2020
by   Jhe-Yu Liou, et al.
0

GPUs are a key enabler of the revolution in machine learning and high performance computing, functioning as de facto co-processors to accelerate large-scale computation. As the programming stack and tool support have matured, GPUs have also become accessible to programmers, who may lack detailed knowledge of the underlying architecture and fail to fully leverage the GPU's computation power. GEVO (Gpu optimization using EVOlutionary computation) is a tool for automatically discovering optimization opportunities and tuning the performance of GPU kernels in the LLVM representation. GEVO uses population-based search to find edits to GPU code compiled to LLVM-IR and improves performance on desired criteria while retaining required functionality. We demonstrate that GEVO improves the execution time of the GPU programs in the Rodinia benchmark suite and the machine learning models, SVM and ResNet18, on NVIDIA Tesla P100. For the Rodinia benchmarks, GEVO improves GPU kernel runtime performance by an average of 49.48 over the fully compiler-optimized baseline. If kernel output accuracy is relaxed to tolerate up to 1 outperform the baseline version by an average of 51.08 learning workloads, GEVO achieves kernel performance improvement for SVM on the MNIST handwriting recognition (3.24X) and the a9a income prediction (2.93X) datasets with no loss of model accuracy. GEVO achieves 1.79X kernel performance improvement on image classification using ResNet18/CIFAR-10, with less than 1 model accuracy reduction.

READ FULL TEXT
research
04/17/2020

GEVO: GPU Code Optimization using EvolutionaryComputation

GPUs are a key enabler of the revolution in machine learning and high pe...
research
08/25/2022

Understanding the Power of Evolutionary Computation for GPU Code Optimization

Achieving high performance for GPU codes requires developers to have sig...
research
04/21/2019

A mechanism for balancing accuracy and scope in cross-machine black-box GPU performance modeling

The ability to model, analyze, and predict execution time of computation...
research
03/01/2021

Accelerating Distributed-Memory Autotuning via Statistical Analysis of Execution Paths

The prohibitive expense of automatic performance tuning at scale has lar...
research
06/22/2023

ACC Saturator: Automatic Kernel Optimization for Directive-Based GPU Code

Automatic code optimization is a complex process that typically involves...
research
10/21/2019

Performance Evaluation of Advanced Features in CUDA Unified Memory

CUDA Unified Memory improves the GPU programmability and also enables GP...
research
12/13/2020

Learning to Schedule Halide Pipelines for the GPU

We present a new algorithm to automatically generate high-performance GP...

Please sign up or login with your details

Forgot password? Click here to reset