
DistGNN: Scalable Distributed Training for Large-Scale Graph Neural Networks
Full-batch training on Graph Neural Networks (GNN) to learn the structur...

Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning Workloads
During the past decade, novel Deep Learning (DL) algorithms/workloads an...

AI Powered Compiler Techniques for DL Code Optimization
Creating high performance implementations of deep learning primitives on...

GNNerator: A Hardware/Software Framework for Accelerating Graph Neural Networks
Graph Neural Networks (GNNs) use a fully-connected layer to extract feat...

Deep Graph Library Optimizations for Intel(R) x86 Architecture
The Deep Graph Library (DGL) was designed as a tool to enable structure ...

Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights
Machine learning (ML) models are widely used in many domains including m...

PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives
Deep Neural Networks (DNNs) have revolutionized many aspects of our live...

PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives
At the heart of deep learning training and inferencing are computational...

SEERL: Sample Efficient Ensemble Reinforcement Learning
Ensemble learning is a very prevalent method employed in machine learnin...

High Performance Scalable FPGA Accelerator for Deep Neural Networks
Low-precision is the first order knob for achieving higher Artificial In...

High-Performance Deep Learning via a Single Building Block
Deep learning (DL) is one of the most prominent branches of machine lear...

A Study of BFLOAT16 for Deep Learning Training
This paper presents the first comprehensive empirical study demonstratin...

Anatomy Of High-Performance Deep Learning Convolutions On SIMD Architectures
Convolution layers are prevalent in many classes of deep neural networks...

Hierarchical Block Sparse Neural Networks
Sparse deep neural networks (DNNs) are efficient in both memory and compu...

Mixed Precision Training of Convolutional Neural Networks using Integer Operations
The state-of-the-art (SOTA) for mixed precision training is dominated by...

On Scale-out Deep Learning Training for Cloud and HPC
The exponential growth in use of large deep neural networks has accelera...

RAIL: Risk-Averse Imitation Learning
Imitation learning algorithms learn viable policies by imitating an expe...
Sasikanth Avancha