ZnG: Architecting GPU Multi-Processors with New Flash for Scalable Data Analysis

06/16/2020
by Jie Zhang, et al.

We propose ZnG, a new GPU-SSD integrated architecture that maximizes the memory capacity of a GPU and addresses the performance penalties imposed by an SSD. Specifically, ZnG replaces all of the GPU's internal DRAM with an ultra-low-latency SSD to maximize GPU memory capacity. ZnG further removes the SSD's performance bottleneck by replacing its flash channels with a high-throughput flash network and by integrating the SSD firmware into the GPU's MMU to reap the benefits of hardware acceleration. Although the flash arrays within the SSD can deliver high aggregate bandwidth, only a small fraction of that bandwidth can be used by the GPU's memory requests, because the access granularity of those requests does not match that of the flash. To address this, ZnG employs a large L2 cache and flash registers to buffer the memory requests. Our evaluation results indicate that ZnG can achieve 7.5x higher performance than prior work.
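
The granularity mismatch mentioned in the abstract can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the function names are hypothetical, and the 128-byte request size and 4 KiB flash page size are assumed, illustrative values. It estimates what fraction of each flash page read carries data the GPU actually asked for, first with no buffering and then when a buffer (such as a cache or register) lets several requests be served from one page read.

```python
def useful_fraction(request_bytes: int, flash_page_bytes: int) -> float:
    """Fraction of a flash page read that is useful when every request
    triggers its own full-page read (no buffering, no reuse)."""
    if request_bytes <= 0 or flash_page_bytes <= 0:
        raise ValueError("sizes must be positive")
    return min(request_bytes, flash_page_bytes) / flash_page_bytes


def useful_fraction_buffered(request_bytes: int, flash_page_bytes: int,
                             requests_per_page: int) -> float:
    """Same estimate when a buffer serves `requests_per_page` distinct,
    non-overlapping requests from a single page read."""
    if requests_per_page <= 0:
        raise ValueError("requests_per_page must be positive")
    used = min(request_bytes * requests_per_page, flash_page_bytes)
    return used / flash_page_bytes


# Assumed sizes for illustration only (not figures from the paper).
REQUEST_BYTES = 128        # hypothetical GPU memory request size
FLASH_PAGE_BYTES = 4096    # hypothetical flash page size

unbuffered = useful_fraction(REQUEST_BYTES, FLASH_PAGE_BYTES)
buffered = useful_fraction_buffered(REQUEST_BYTES, FLASH_PAGE_BYTES, 16)
print(f"unbuffered: {unbuffered:.4f}, buffered (16 requests/page): {buffered:.4f}")
```

Under these assumed sizes, an unbuffered request uses 1/32 of the page it reads, which is the kind of underutilization that buffering in a cache or in registers is meant to reduce.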

research
09/12/2021

Ohm-GPU: Integrating New Optical Network and Heterogeneous Memory into GPU Multi-Processors

Traditional graphics processing units (GPUs) suffer from the low memory ...
research
08/24/2020

Tearing Down the Memory Wall

We present a vision for the Erudite architecture that redefines the comp...
research
10/08/2019

Performance Impact of Memory Channels on Sparse and Irregular Algorithms

Graph processing is typically considered to be a memory-bound rather tha...
research
07/30/2023

Exploiting Parallel Memory Write Requests for Covert Channel Attacks in Integrated CPU-GPU Systems

In heterogeneous SoCs, accelerators like integrated GPUs (iGPUs) are int...
research
05/08/2023

A Case for CXL-Centric Server Processors

The memory system is a major performance determinant for server processo...
research
06/12/2020

EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal In GPUs

Modern analytics and recommendation systems are increasingly based on gr...
research
10/19/2020

Enabling High-Capacity, Latency-Tolerant, and Highly-Concurrent GPU Register Files via Software/Hardware Cooperation

Graphics Processing Units (GPUs) employ large register files to accommod...