
Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans
The COVID-19 pandemic has spread globally for several months. Because it...
Towards Scalable Distributed Training of Deep Learning on Public Cloud Clusters
Distributed training techniques have been widely deployed in large-scale...
Performance Characterization and Bottleneck Analysis of Hyperledger Fabric
Hyperledger Fabric is a popular open-source project for deploying permis...
Efficient Sparse-Dense Matrix-Matrix Multiplication on GPUs Using the Customized Sparse Storage Format
Multiplication of a sparse matrix to a dense matrix (SpDM) is widely use...
Communication-Efficient Distributed Deep Learning: Survey, Evaluation, and Challenges
In recent years, distributed deep learning techniques are widely deploye...
FADNet: A Fast and Accurate Network for Disparity Estimation
Deep neural networks (DNNs) have achieved great success in the area of c...
Communication-Efficient Distributed Deep Learning: A Comprehensive Survey
Distributed deep learning becomes very common to reduce the overall trai...
Communication Contention Aware Scheduling of Multiple Deep Learning Training Jobs
Distributed Deep Learning (DDL) has rapidly grown its popularity since i...
FMore: An Incentive Scheme of Multi-dimensional Auction for Federated Learning in MEC
Promising federated learning coupled with Mobile Edge Computing (MEC) is...
Communication-Efficient Decentralized Learning with Sparsification and Adaptive Peer Selection
Distributed learning techniques such as federated learning have enabled ...
A Survey of Deep Learning Techniques for Neural Machine Translation
In recent years, natural language processing (NLP) has got great develop...
IRS: A Large Synthetic Indoor Robotics Stereo Dataset for Disparity and Surface Normal Estimation
Indoor robotics localization, navigation and interaction heavily rely on...
MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning
Distributed synchronous stochastic gradient descent has been widely used...
Understanding Top-k Sparsification in Distributed Deep Learning
Distributed stochastic gradient descent (SGD) algorithms are widely depl...
Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees
To reduce the long training time of large deep neural network (DNN) mode...
Computer-Aided Clinical Skin Disease Diagnosis Using CNN and Object Detection Models
Skin disease is one of the most common types of human diseases, which ma...
Performance and Power Evaluation of AI Accelerators for Training Deep Learning Models
Deep neural networks (DNNs) have become widely used in many AI applicati...
AutoML: A Survey of the State-of-the-Art
Deep learning has penetrated all aspects of our lives and brought us gre...
The Impact of GPU DVFS on the Energy and Performance of Deep Learning: an Empirical Study
Over the past years, great progress has been made in improving the compu...
Measurement and Analysis of the Bitcoin Networks: A View from Mining Pools
Mining pools, the main components of the Bitcoin network, dominate the c...
GPU Accelerated Keccak (SHA3) Algorithm
Hash functions like SHA-1 or MD5 are one of the most important cryptogra...
GPU Accelerated AES Algorithm
It has been widely accepted that Graphics Processing Units (GPU) is one ...
A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks
Distributed synchronous stochastic gradient descent (SSGD) with data pa...
MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms
Distributed synchronous stochastic gradient descent has been widely used...
Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes
Synchronized stochastic gradient descent (SGD) optimizers with data para...
Modeling and Evaluation of Synchronous Stochastic Gradient Descent in Distributed Deep Learning on Multiple GPUs
With huge amounts of training data, deep learning has made great breakth...
Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs
Deep learning frameworks have been widely deployed on GPU servers for de...
Performance Evaluation of Deep Learning Tools in Docker Containers
With the success of deep learning techniques in a broad range of applica...
GPGPU Performance Estimation with Core and Memory Frequency Scaling
Graphics Processing Units (GPUs) support dynamic voltage and frequency s...
Xiaowen Chu
