An Intelligent Framework for Oversubscription Management in CPU-GPU Unified Memory

04/06/2022
by   Xinjian Long, et al.

This paper proposes a novel intelligent framework for oversubscription management in CPU-GPU unified virtual memory (UVM). We analyze the current rule-based methods for GPU memory oversubscription with unified memory, as well as the current learning-based methods for other computer architecture components. We then identify the performance gap between the existing rule-based methods and the theoretical upper bound, along with the advantages of applying machine intelligence and the limitations of the existing learning-based methods. The framework consists of an access pattern classifier followed by a pattern-specific Transformer-based model trained with a novel loss function aimed at reducing page thrashing. A policy engine uses the model's predictions to perform accurate page prefetching and pre-eviction. We evaluate the framework on a set of 11 memory-intensive benchmarks from popular benchmark suites. Our solution outperforms the state-of-the-art (SOTA) methods for oversubscription management: under 125% memory oversubscription it reduces the number of pages thrashed by 64.4% compared to the baseline, while the SOTA method reduces it by 17.3%. Our solution achieves an average IPC improvement of 1.52X under 125% memory oversubscription and 3.66X under 150% memory oversubscription. It also outperforms the existing learning-based methods for page address prediction, improving top-1 accuracy by 6.45% on average (up to 41.2%) for a single GPGPU workload and by 10.2% on average (up to 30.2%) for multiple concurrent GPGPU workloads.
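To make the thrashing metric concrete, the following is a minimal sketch of how pages thrashed could be counted for a page access trace under oversubscription. It is not the paper's framework or its simulator: it swaps in a plain LRU eviction policy as a stand-in baseline, assumes that 125% oversubscription means the working set is 1.25 times the device memory capacity, and treats a page as thrashed when it is faulted in again after having been evicted. The function name `count_thrashed_pages` and the toy trace are hypothetical.

```python
from collections import OrderedDict


def count_thrashed_pages(trace, capacity):
    """Count faults on pages that were previously evicted (LRU eviction).

    trace:    iterable of page numbers in access order (hypothetical input)
    capacity: number of pages that fit in device memory
    """
    resident = OrderedDict()  # resident pages, least recently used first
    evicted = set()           # pages that have been evicted at least once
    thrashed = 0
    for page in trace:
        if page in resident:
            # Hit: mark the page as most recently used.
            resident.move_to_end(page)
            continue
        # Page fault: a fault on a previously evicted page counts as thrashing.
        if page in evicted:
            thrashed += 1
        if len(resident) >= capacity:
            victim, _ = resident.popitem(last=False)  # evict the LRU page
            evicted.add(victim)
        resident[page] = None
    return thrashed


# Toy example (assumption): a 10-page working set swept three times,
# with room for 8 pages, i.e. a working set at 125% of capacity.
trace = list(range(10)) * 3
print(count_thrashed_pages(trace, capacity=8))
```

With a cyclic sweep like this, LRU always evicts the page that will be needed soonest, so every access after the first pass faults on an evicted page. Reducing that count through predicted prefetching and pre-eviction is the role the abstract assigns to the policy engine.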


research
03/19/2022

Deep Learning based Data Prefetching in CPU-GPU Unified Virtual Memory

Unified Virtual Memory (UVM) relieves the developers from the onus of ma...
research
07/20/2023

FHPM: Fine-grained Huge Page Management For Virtualization

As more data-intensive tasks with large footprints are deployed in virtu...
research
04/30/2018

Mosaic: An Application-Transparent Hardware-Software Cooperative Memory Manager for GPUs

Modern GPUs face a trade-off on how the page size used for memory manage...
research
05/29/2022

TransforMAP: Transformer for Memory Access Prediction

Data Prefetching is a technique that can hide memory latency by fetching...
research
03/17/2020

Co-Optimizing Performance and Memory Footprint Via Integrated CPU/GPU Memory Management, an Implementation on Autonomous Driving Platform

Cutting-edge embedded system applications, such as self-driving cars and...
research
10/21/2019

Performance Evaluation of Advanced Features in CUDA Unified Memory

CUDA Unified Memory improves the GPU programmability and also enables GP...
research
09/22/2022

Deep Learning Based Page Creation for Improving E-Commerce Organic Search Traffic

Organic search comprises a large portion of the total traffic for e-comm...
