Deep Learning based Data Prefetching in CPU-GPU Unified Virtual Memory

03/19/2022
by Xinjian Long, et al.

Unified Virtual Memory (UVM) relieves developers of the onus of maintaining complex data structures and explicit data migration by enabling on-demand data movement between CPU memory and GPU memory. However, on-demand paging soon becomes a performance bottleneck of UVM due to the high latency caused by page table walks and data migration over the interconnect. Prefetching is considered a promising solution to this problem given its ability to leverage the locality of program memory access patterns. However, existing locality-based prefetching schemes cannot handle all situations. Structures like arrays tend to be stored in contiguous blocks and accessed repeatedly. An ideal prefetcher should not only look at narrow regions of the requested address space but also capture global context to deliver a good prediction of the memory access pattern. This paper proposes a novel approach to page prefetching for UVM through deep learning. We first show that a powerful Transformer learning model can provide high accuracy for UVM page prefetching. We then analyze this Transformer model to interpret it and derive several insights that allow us to design a simpler model that matches the unconstrained model's accuracy at orders of magnitude lower cost. We evaluate this simplified model on a set of 11 memory-intensive benchmarks from popular benchmark suites. Our solution outperforms the state-of-the-art UVM framework, improving performance by 10.89%, [...] for prior art), and reducing the CPU-GPU interconnect traffic by 11.05%. According to our proposed unified metric, which combines accuracy, coverage, and page hit rate, our solution comes closer to the ideal prefetching scheme than the state-of-the-art design (0.90 vs. 0.85, where a perfect prefetcher scores 1.0).
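To make the limits of locality-based prefetching concrete, the following is a minimal sketch, not the paper's method or its unified metric. It replays a page-access trace against a hypothetical next-page prefetcher and reports the conventional prefetch accuracy (useful prefetches over issued prefetches) and coverage (would-be faults removed by prefetching). The function names, the toy traces, and the assumption of unbounded GPU memory with no eviction are all illustrative.

```python
def evaluate_prefetcher(trace, predict):
    """Replay a page-access trace and score a prefetcher.

    accuracy = useful prefetches / issued prefetches
    coverage = useful prefetches / (useful prefetches + demand faults)
    Assumes unbounded device memory (no eviction), for illustration only.
    """
    resident = set()    # pages already in simulated GPU memory
    prefetched = set()  # pages brought in by prefetch and not yet used
    issued = useful = faults = 0
    for page in trace:
        if page in prefetched:
            useful += 1              # a prefetch avoided a fault
            prefetched.discard(page)
        elif page not in resident:
            faults += 1              # demand fault: migrate on access
        resident.add(page)
        for candidate in predict(page):
            if candidate not in resident:
                resident.add(candidate)
                prefetched.add(candidate)
                issued += 1
    accuracy = useful / issued if issued else 0.0
    total = useful + faults
    coverage = useful / total if total else 0.0
    return accuracy, coverage


def next_page(page):
    """Hypothetical locality-based policy: prefetch the adjacent page."""
    return [page + 1]


sequential = list(range(8))             # contiguous array walk
strided = [0, 4, 8, 12, 0, 4, 8, 12]    # stride-4 walk, repeated

print(evaluate_prefetcher(sequential, next_page))  # (0.875, 0.875)
print(evaluate_prefetcher(strided, next_page))     # (0.0, 0.0)
```

On the contiguous trace the adjacent-page policy removes all but the first fault, while on the strided trace every prefetch is wasted. That is the kind of pattern where a predictor with broader context could help.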
