
-
Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans
The COVID-19 pandemic has spread globally for several months. Because it...
-
Towards Scalable Distributed Training of Deep Learning on Public Cloud Clusters
Distributed training techniques have been widely deployed in large-scale...
-
Performance Characterization and Bottleneck Analysis of Hyperledger Fabric
Hyperledger Fabric is a popular open-source project for deploying permis...
-
Efficient Sparse-Dense Matrix-Matrix Multiplication on GPUs Using the Customized Sparse Storage Format
Multiplication of a sparse matrix to a dense matrix (SpDM) is widely use...
-
Communication-Efficient Distributed Deep Learning: Survey, Evaluation, and Challenges
In recent years, distributed deep learning techniques are widely deploye...
-
FADNet: A Fast and Accurate Network for Disparity Estimation
Deep neural networks (DNNs) have achieved great success in the area of c...
-
Communication-Efficient Distributed Deep Learning: A Comprehensive Survey
Distributed deep learning becomes very common to reduce the overall trai...
-
Communication Contention Aware Scheduling of Multiple Deep Learning Training Jobs
Distributed Deep Learning (DDL) has rapidly grown its popularity since i...
-
FMore: An Incentive Scheme of Multi-dimensional Auction for Federated Learning in MEC
Promising federated learning coupled with Mobile Edge Computing (MEC) is...
-
Communication-Efficient Decentralized Learning with Sparsification and Adaptive Peer Selection
Distributed learning techniques such as federated learning have enabled ...
-
A Survey of Deep Learning Techniques for Neural Machine Translation
In recent years, natural language processing (NLP) has got great develop...
-
IRS: A Large Synthetic Indoor Robotics Stereo Dataset for Disparity and Surface Normal Estimation
Indoor robotics localization, navigation and interaction heavily rely on...
-
MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning
Distributed synchronous stochastic gradient descent has been widely used...
-
Understanding Top-k Sparsification in Distributed Deep Learning
Distributed stochastic gradient descent (SGD) algorithms are widely depl...
-
Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees
To reduce the long training time of large deep neural network (DNN) mode...
-
Computer-Aided Clinical Skin Disease Diagnosis Using CNN and Object Detection Models
Skin disease is one of the most common types of human diseases, which ma...
-
Performance and Power Evaluation of AI Accelerators for Training Deep Learning Models
Deep neural networks (DNNs) have become widely used in many AI applicati...
-
AutoML: A Survey of the State-of-the-Art
Deep learning has penetrated all aspects of our lives and brought us gre...
-
The Impact of GPU DVFS on the Energy and Performance of Deep Learning: an Empirical Study
Over the past years, great progress has been made in improving the compu...
-
Measurement and Analysis of the Bitcoin Networks: A View from Mining Pools
Mining pools, the main components of the Bitcoin network, dominate the c...
-
GPU Accelerated Keccak (SHA3) Algorithm
Hash functions like SHA-1 or MD5 are one of the most important cryptogra...
-
GPU Accelerated AES Algorithm
It has been widely accepted that Graphics Processing Units (GPU) is one ...
-
A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks
Distributed synchronous stochastic gradient descent (S-SGD) with data pa...
-
MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms
Distributed synchronous stochastic gradient descent has been widely used...
-
Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes
Synchronized stochastic gradient descent (SGD) optimizers with data para...
-
Modeling and Evaluation of Synchronous Stochastic Gradient Descent in Distributed Deep Learning on Multiple GPUs
With huge amounts of training data, deep learning has made great breakth...
-
Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs
Deep learning frameworks have been widely deployed on GPU servers for de...
-
Performance Evaluation of Deep Learning Tools in Docker Containers
With the success of deep learning techniques in a broad range of applica...
-
GPGPU Performance Estimation with Core and Memory Frequency Scaling
Graphics Processing Units (GPUs) support dynamic voltage and frequency s...