
Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale
Tremendous success of machine learning (ML) and the unabated growth in M...

Efficient Soft-Error Detection for Low-precision Deep Learning Recommendation Models
Soft error, namely silent corruption of signal or datum in a computer sy...

Mixed-Precision Embedding Using a Cache
In recommendation systems, practitioners observed that increase in the n...

Fast Distributed Training of Deep Neural Networks: Dynamic Communication Thresholding for Model and Data Parallelism
Data Parallelism (DP) and Model Parallelism (MP) are two common paradigm...

Leveraging the bfloat16 Artificial Intelligence Datatype For Higher-Precision Computations
In recent years fused-multiply-add (FMA) units with lower-precision mult...

Dictionary Learning by Dynamical Neural Networks
A dynamical neural network consists of a set of interconnected neurons t...

A Progressive Batching L-BFGS Method for Machine Learning
The standard L-BFGS method relies on gradient approximations that are no...

Sparse Coding by Spiking Neural Networks: Convergence Theory and Computational Results
In a spiking neural network (SNN), individual neurons operate autonomous...

Enabling Sparse Winograd Convolution by Native Pruning
Sparse methods and the use of Winograd convolutions are two orthogonal a...

Faster CNNs with Direct Sparse Convolutions and Guided Pruning
Phenomenally successful in practical inference problems, convolutional n...
Ping Tak Peter Tang