Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking

04/18/2018
by Zhe Jia, et al.

Every year, novel NVIDIA GPU designs are introduced. This rapid architectural and technological progression, coupled with manufacturers' reluctance to disclose low-level details, makes it difficult for even the most proficient GPU software designers to stay up to date with advances at the microarchitectural level. To address this dearth of public microarchitectural information on novel NVIDIA GPUs, independent researchers have resorted to microbenchmark-based dissection and discovery. This has led to a prolific line of publications that shed light on instruction encoding and on the geometry and features of each level of the memory hierarchy, describing the performance and behavior of the Kepler, Maxwell and Pascal architectures. In this technical report, we continue this line of research by presenting the microarchitectural details of the NVIDIA Volta architecture, discovered through microbenchmarks and instruction set disassembly. Additionally, we quantitatively compare our Volta findings against those for its predecessors, Kepler, Maxwell and Pascal.
