
Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans
The COVID-19 pandemic has spread globally for several months. Because it...

Towards Scalable Distributed Training of Deep Learning on Public Cloud Clusters
Distributed training techniques have been widely deployed in large-scale...

Efficient Sparse-Dense Matrix-Matrix Multiplication on GPUs Using the Customized Sparse Storage Format
Multiplication of a sparse matrix to a dense matrix (SpDM) is widely use...

Communication-Efficient Distributed Deep Learning: Survey, Evaluation, and Challenges
In recent years, distributed deep learning techniques are widely deploye...

FADNet: A Fast and Accurate Network for Disparity Estimation
Deep neural networks (DNNs) have achieved great success in the area of c...

Communication-Efficient Distributed Deep Learning: A Comprehensive Survey
Distributed deep learning becomes very common to reduce the overall trai...

Communication Contention Aware Scheduling of Multiple Deep Learning Training Jobs
Distributed Deep Learning (DDL) has rapidly grown its popularity since i...

Communication-Efficient Decentralized Learning with Sparsification and Adaptive Peer Selection
Distributed learning techniques such as federated learning have enabled ...

MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning
Distributed synchronous stochastic gradient descent has been widely used...

Understanding Top-k Sparsification in Distributed Deep Learning
Distributed stochastic gradient descent (SGD) algorithms are widely depl...

Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees
To reduce the long training time of large deep neural network (DNN) mode...

Computer-Aided Clinical Skin Disease Diagnosis Using CNN and Object Detection Models
Skin disease is one of the most common types of human diseases, which ma...

Performance and Power Evaluation of AI Accelerators for Training Deep Learning Models
Deep neural networks (DNNs) have become widely used in many AI applicati...

A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks
Distributed synchronous stochastic gradient descent (S-SGD) with data pa...

MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms
Distributed synchronous stochastic gradient descent has been widely used...

Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes
Synchronized stochastic gradient descent (SGD) optimizers with data para...

Modeling and Evaluation of Synchronous Stochastic Gradient Descent in Distributed Deep Learning on Multiple GPUs
With huge amounts of training data, deep learning has made great breakth...

Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs
Deep learning frameworks have been widely deployed on GPU servers for de...

Performance Evaluation of Deep Learning Tools in Docker Containers
With the success of deep learning techniques in a broad range of applica...
Shaohuai Shi