Modeling Deep Learning Accelerator Enabled GPUs

11/19/2018
by   Md Aamir Raihan, et al.

The efficacy of deep learning has resulted in its use in a growing number of applications. The Volta graphics processing unit (GPU) architecture from NVIDIA introduced a specialized functional unit, the "tensor core", that helps meet the growing demand for higher performance in deep learning. In this paper we study the design of the tensor cores in NVIDIA's Volta and Turing architectures. We further propose an architectural model for the tensor cores in Volta. When implemented in a GPU simulator, GPGPU-Sim, our tensor core model achieves 99.6% correlation with an NVIDIA Titan V GPU in terms of average instructions per cycle when running tensor core enabled GEMM workloads. We also describe support added to enable GPGPU-Sim to run CUTLASS, an open-source CUDA C++ template library providing customizable GEMM templates that utilize tensor cores.
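To make the correlation figure concrete, the following is a minimal sketch of how a simulator-versus-hardware correlation over per-workload IPC could be computed. It assumes the metric is the Pearson correlation coefficient, which the abstract does not state. The function name `pearson_correlation` and the IPC lists are hypothetical placeholders for illustration only, not measurements from the paper.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalized; the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical placeholder IPC values, one entry per GEMM workload.
    # These are NOT results from the paper.
    hardware_ipc = [0.8, 1.1, 1.5, 1.9, 2.4]
    simulated_ipc = [0.7, 1.2, 1.4, 2.0, 2.3]
    r = pearson_correlation(hardware_ipc, simulated_ipc)
    print(f"correlation: {r:.4f}")
```

A coefficient near 1 means the simulated IPC rises and falls with the hardware IPC across workloads. It does not by itself bound the absolute error of any single workload.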


research
03/11/2018

NVIDIA Tensor Core Programmability, Performance & Precision

The NVIDIA Volta GPU microarchitecture introduces a specialized unit, ca...
research
10/15/2018

MGSim + MGMark: A Framework for Multi-GPU System Research

The rapidly growing popularity and scale of data-parallel workloads dema...
research
11/18/2018

Analyzing Machine Learning Workloads Using a Detailed GPU Simulator

Most deep neural networks deployed today are trained using GPUs via high...
research
10/20/2018

The Ocean Tensor Package

Matrix and tensor operations form the basis of a wide range of fields an...
research
04/12/2023

Programming Language Assisted Waveform Analysis: A Case Study on the Instruction Performance of SERV

RISC-V's growing traction leads to the release of new RISC-V cores on a n...
research
11/27/2020

High-Throughput Parallel Viterbi Decoder on GPU Tensor Cores

Many research works have been performed on implementation of Viterbi de...
research
11/07/2019

MERIT: Tensor Transform for Memory-Efficient Vision Processing on Parallel Architectures

Computationally intensive deep neural networks (DNNs) are well-suited to...
