Performance Evaluation of Advanced Features in CUDA Unified Memory

10/21/2019
by Steven W. D. Chien, et al.

CUDA Unified Memory improves GPU programmability and also enables GPU memory oversubscription. Recently, two advanced memory features, memory advises and asynchronous prefetch, have been introduced. In this work, we evaluate the new features on two platforms that feature different CPUs, GPUs, and interconnects. We derive a benchmark suite for the experiments and stress the memory system to evaluate both in-memory and oversubscription performance. The results show that memory advises on the Intel-Volta/Pascal-PCIe platform bring negligible improvement for in-memory executions. However, when GPU memory is oversubscribed by about 50%, memory advises improve performance compared to the basic CUDA Unified Memory. In contrast, the Power9-Volta-NVLink platform can substantially benefit from memory advises, achieving up to 34% performance gain for in-memory executions. However, when GPU memory is oversubscribed on this platform, using memory advises increases GPU page faults and results in considerable performance loss. The CUDA prefetch also shows different performance impact on the two platforms. It improves performance by up to 50% on the Intel-Volta/Pascal-PCIe platform but brings little benefit to the Power9-Volta-NVLink platform.
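
For readers unfamiliar with the two features named in the abstract, the following is a minimal sketch of how memory advises and asynchronous prefetch are typically applied to a managed allocation. It is not taken from the paper's benchmark suite: the kernel `scale`, the buffer size, and the particular advice values are illustrative assumptions, and the calls use the CUDA runtime signatures from the CUDA 8 to 12 releases (later toolkits changed the argument lists of `cudaMemAdvise` and `cudaMemPrefetchAsync`).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel used only for illustration: scales each element in place.
__global__ void scale(float *data, size_t n, float factor) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const size_t n = 1 << 24;               // assumed problem size
    const size_t bytes = n * sizeof(float);
    int device = 0;
    cudaGetDevice(&device);

    // Basic Unified Memory: one pointer valid on both CPU and GPU.
    float *data = nullptr;
    if (cudaMallocManaged(&data, bytes) != cudaSuccess) return 1;
    for (size_t i = 0; i < n; ++i) data[i] = 1.0f;   // first touch on the CPU

    // Memory advises: hint that the pages should preferably reside on the GPU
    // and that the CPU will also access them.
    cudaMemAdvise(data, bytes, cudaMemAdviseSetPreferredLocation, device);
    cudaMemAdvise(data, bytes, cudaMemAdviseSetAccessedBy, cudaCpuDeviceId);

    // Asynchronous prefetch: migrate the pages to the GPU ahead of the kernel
    // instead of relying on GPU page faults to pull them in on demand.
    cudaMemPrefetchAsync(data, bytes, device, 0);

    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);

    // Prefetch the result back to host memory before the CPU reads it.
    cudaMemPrefetchAsync(data, bytes, cudaCpuDeviceId, 0);
    cudaDeviceSynchronize();

    printf("data[0] = %f\n", data[0]);
    cudaFree(data);
    return 0;
}
```

Whether these hints help depends on the platform and on whether the working set fits in GPU memory, which is the question the evaluation above addresses.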

research
07/20/2020

UVMBench: A Comprehensive Benchmark Suite for Researching Unified Virtual Memory in GPUs

The recent introduction of Unified Virtual Memory (UVM) in GPUs offers a...
research
06/09/2021

Benchmarking the Nvidia GPU Lineage: From Early K80 to Modern A100 with Asynchronous Memory Transfers

For many, Graphics Processing Units (GPUs) provide a source of reliable...
research
06/13/2019

Thread Batching for High-performance Energy-efficient GPU Memory Design

Massive multi-threading in GPU imposes tremendous pressure on memory sub...
research
04/17/2020

GEVO: GPU Code Optimization using Evolutionary Computation

GPUs are a key enabler of the revolution in machine learning and high pe...
research
03/17/2020

Co-Optimizing Performance and Memory Footprint Via Integrated CPU/GPU Memory Management, an Implementation on Autonomous Driving Platform

Cutting-edge embedded system applications, such as self-driving cars and...
research
04/06/2022

An Intelligent Framework for Oversubscription Management in CPU-GPU Unified Memory

This paper proposes a novel intelligent framework for oversubscription m...
