MIOpen: An Open Source Library For Deep Learning Primitives

09/30/2019
by Jehandad Khan, et al.

Deep learning has established itself as a common term in the business lexicon. Its unprecedented success in recent years can be attributed to the abundance of data, the availability of gargantuan compute capabilities offered by GPUs, and the adoption of an open-source philosophy by researchers and industry. Deep neural networks can be decomposed into a series of different operators. MIOpen, AMD's open-source deep learning primitives library for GPUs, provides highly optimized implementations of such operators, shielding researchers from internal implementation details and thereby accelerating the time to discovery. This paper introduces MIOpen and provides details about the internal workings of the library and its supported features. MIOpen innovates on several fronts, such as implementing fusion to optimize for memory bandwidth and GPU launch overheads, providing an auto-tuning infrastructure to overcome the large design space of problem configurations, and implementing different algorithms to optimize convolutions for different filter and input sizes. MIOpen is one of the first libraries to publicly support the bfloat16 data type for convolutions, allowing efficient training at lower precision without loss of accuracy.
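To make the auto-tuning claim concrete, the following is a minimal sketch (not taken from the paper) of MIOpen's find-and-run convolution workflow in C++: the find step benchmarks candidate kernels for the exact problem configuration, and the best algorithm it reports is then reused for the actual convolution. Tensor shapes are illustrative, error checking is omitted, and the exact signatures should be checked against miopen.h; on stacks with bfloat16 support, miopenBFloat16 could be substituted for miopenFloat below.

    // Sketch of MIOpen's find-and-run convolution workflow (illustrative only).
    #include <hip/hip_runtime.h>
    #include <miopen/miopen.h>
    #include <cstddef>

    int main() {
        miopenHandle_t handle;
        miopenCreate(&handle);

        // NCHW tensor descriptors for input, filters, and output.
        miopenTensorDescriptor_t xDesc, wDesc, yDesc;
        miopenCreateTensorDescriptor(&xDesc);
        miopenCreateTensorDescriptor(&wDesc);
        miopenCreateTensorDescriptor(&yDesc);
        miopenSet4dTensorDescriptor(xDesc, miopenFloat, 16, 64, 56, 56); // N,C,H,W
        miopenSet4dTensorDescriptor(wDesc, miopenFloat, 128, 64, 3, 3);  // K,C,R,S

        // 3x3 convolution: pad 1, stride 1, dilation 1.
        miopenConvolutionDescriptor_t convDesc;
        miopenCreateConvolutionDescriptor(&convDesc);
        miopenInitConvolutionDescriptor(convDesc, miopenConvolution, 1, 1, 1, 1, 1, 1);

        int n, c, h, w;
        miopenGetConvolutionForwardOutputDim(convDesc, xDesc, wDesc, &n, &c, &h, &w);
        miopenSet4dTensorDescriptor(yDesc, miopenFloat, n, c, h, w);

        // Device buffers; contents are irrelevant for this sketch.
        void *x, *wts, *y, *workspace;
        size_t wsSize = 0;
        hipMalloc(&x,   sizeof(float) * 16 * 64 * 56 * 56);
        hipMalloc(&wts, sizeof(float) * 128 * 64 * 3 * 3);
        hipMalloc(&y,   sizeof(float) * (size_t)n * c * h * w);
        miopenConvolutionForwardGetWorkSpaceSize(handle, wDesc, xDesc, convDesc, yDesc, &wsSize);
        hipMalloc(&workspace, wsSize);

        // Find step: times candidate algorithms/kernels for this exact problem
        // configuration; the final 'true' requests the exhaustive (tuning) search.
        miopenConvAlgoPerf_t perf;
        int returned = 0;
        miopenFindConvolutionForwardAlgorithm(handle, xDesc, x, wDesc, wts, convDesc,
                                              yDesc, y, 1, &returned, &perf,
                                              workspace, wsSize, true);

        // Run step: execute the convolution with the best algorithm found.
        const float alpha = 1.0f, beta = 0.0f;
        miopenConvolutionForward(handle, &alpha, xDesc, x, wDesc, wts, convDesc,
                                 perf.fwd_algo, &beta, yDesc, y, workspace, wsSize);

        hipFree(workspace); hipFree(y); hipFree(wts); hipFree(x);
        miopenDestroyConvolutionDescriptor(convDesc);
        miopenDestroyTensorDescriptor(yDesc);
        miopenDestroyTensorDescriptor(wDesc);
        miopenDestroyTensorDescriptor(xDesc);
        miopenDestroy(handle);
        return 0;
    }

MIOpen caches the results of such find and tuning steps in its performance databases, which is how the auto-tuning infrastructure described in the abstract amortizes the search cost across runs.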


Related research

11/22/2020: Differentiable Computational Geometry for 2D and 3D machine learning
With the growth of machine learning algorithms with geometry primitives,...

12/18/2018: wav2letter++: The Fastest Open-source Speech Recognition System
This paper introduces wav2letter++, the fastest open-source deep learnin...

08/17/2017: Designing and building the mlpack open-source machine learning library
mlpack is an open-source C++ machine learning library with an emphasis o...

04/27/2023: JaxPruner: A concise library for sparsity research
This paper introduces JaxPruner, an open-source JAX-based pruning and sp...

12/09/2022: Towards a learning-based performance modeling for accelerating Deep Neural Networks
Emerging applications such as Deep Learning are often data-driven, thus ...

09/29/2020: TorchRadon: Fast Differentiable Routines for Computed Tomography
This work presents TorchRadon, an open source CUDA library which contai...

02/06/2020: PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives
At the heart of deep learning training and inferencing are computational...
