Bharat Kaul

research

∙ 04/14/2023

AUTOSPARSE: Towards Automated Sparse Training of Deep Neural Networks

Sparse training is emerging as a promising avenue for reducing the compu...

0 Abhisek Kundu, et al. ∙

research

∙ 04/16/2021

Efficient and Generic 1D Dilated Convolution Layer for Deep Learning

Convolutional neural networks (CNNs) have found many applications in tas...

0 Narendra Chaudhary, et al. ∙

research

∙ 04/12/2021

AI Powered Compiler Techniques for DL Code Optimization

Creating high performance implementations of deep learning primitives on...

0 Sanket Tavarageri, et al. ∙

research

∙ 03/19/2021

GNNerator: A Hardware/Software Framework for Accelerating Graph Neural Networks

Graph Neural Networks (GNNs) use a fully-connected layer to extract feat...

0 Jacob R. Stevens, et al. ∙

research

∙ 06/02/2020

PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives

Deep Neural Networks (DNNs) have revolutionized many aspects of our live...

0 Sanket Tavarageri, et al. ∙

research

∙ 02/06/2020

PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives

At the heart of deep learning training and inferencing are computational...

0 Sanket Tavarageri, et al. ∙

research

∙ 01/15/2020

SEERL: Sample Efficient Ensemble Reinforcement Learning

Ensemble learning is a very prevalent method employed in machine learnin...

43 Rohan Saphal, et al. ∙

research

∙ 09/17/2019

K-TanH: Hardware Efficient Activations For Deep Learning

We propose K-TanH, a novel, highly accurate, hardware efficient approxim...

0 Abhisek Kundu, et al. ∙

research

∙ 08/29/2019

High Performance Scalable FPGA Accelerator for Deep Neural Networks

Low-precision is the first order knob for achieving higher Artificial In...

0 Sudarshan Srinivasan, et al. ∙

research

∙ 06/11/2019

Automatic Model Parallelism for Deep Neural Networks with Compiler and Hardware Support

The deep neural networks (DNNs) have been enormously successful in tasks...

0 Sanket Tavarageri, et al. ∙

research

∙ 05/29/2019

Mixed Precision Training With 8-bit Floating Point

Reduced precision computation for deep neural networks is one of the key...

0 Naveen Mellempudi, et al. ∙

research

∙ 05/29/2019

A Study of BFLOAT16 for Deep Learning Training

This paper presents the first comprehensive empirical study demonstratin...

0 Dhiraj Kalamkar, et al. ∙

research

∙ 09/04/2018

Out-of-Distribution Detection Using an Ensemble of Self Supervised Leave-out Classifiers

As deep learning methods form a critical part in commercially important ...

0 Apoorv Vyas, et al. ∙

research

∙ 02/03/2018

Mixed Precision Training of Convolutional Neural Networks using Integer Operations

The state-of-the-art (SOTA) for mixed precision training is dominated by...

0 Dipankar Das, et al. ∙

research

∙ 01/24/2018

On Scale-out Deep Learning Training for Cloud and HPC

The exponential growth in use of large deep neural networks has accelera...

0 Srinivas Sridharan, et al. ∙

research

∙ 07/20/2017

RAIL: Risk-Averse Imitation Learning

Imitation learning algorithms learn viable policies by imitating an expe...

0 Anirban Santara, et al. ∙

research

∙ 07/15/2017

Ternary Residual Networks

Sub-8-bit representation of DNNs incur some discernible loss of accuracy...

0 Abhisek Kundu, et al. ∙

research

∙ 05/02/2017

Ternary Neural Networks with Fine-Grained Quantization

We propose a novel fine-grained quantization (FGQ) method to ternarize p...

0 Naveen Mellempudi, et al. ∙

research

∙ 01/31/2017

Mixed Low-precision Deep Learning Inference using Dynamic Fixed Point

We propose a cluster-based quantization method to convert pre-trained fu...

0 Naveen Mellempudi, et al. ∙


