
Energy-aware Task Scheduling with Deadline Constraint in DVFS-enabled Heterogeneous Clusters
Energy conservation of large data centers for high-performance computing...

Efficient Multi-objective Evolutionary 3D Neural Architecture Search for COVID-19 Detection with Chest CT Scans
COVID-19 pandemic has spread globally for months. Due to its long incuba...

BU-Trace: A Permissionless Mobile System for Privacy-Preserving Intelligent Contact Tracing
The coronavirus disease 2019 (COVID-19) pandemic has caused an unprecede...

Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans
The COVID-19 pandemic has spread globally for several months. Because it...

Towards Scalable Distributed Training of Deep Learning on Public Cloud Clusters
Distributed training techniques have been widely deployed in large-scale...

Performance Characterization and Bottleneck Analysis of Hyperledger Fabric
Hyperledger Fabric is a popular open-source project for deploying permis...

Efficient Sparse-Dense Matrix-Matrix Multiplication on GPUs Using the Customized Sparse Storage Format
Multiplication of a sparse matrix to a dense matrix (SpDM) is widely use...

Communication-Efficient Distributed Deep Learning: Survey, Evaluation, and Challenges
In recent years, distributed deep learning techniques are widely deploye...

FADNet: A Fast and Accurate Network for Disparity Estimation
Deep neural networks (DNNs) have achieved great success in the area of c...

Communication-Efficient Distributed Deep Learning: A Comprehensive Survey
Distributed deep learning becomes very common to reduce the overall trai...

Communication Contention Aware Scheduling of Multiple Deep Learning Training Jobs
Distributed Deep Learning (DDL) has rapidly grown its popularity since i...

FMore: An Incentive Scheme of Multi-dimensional Auction for Federated Learning in MEC
Promising federated learning coupled with Mobile Edge Computing (MEC) is...

Communication-Efficient Decentralized Learning with Sparsification and Adaptive Peer Selection
Distributed learning techniques such as federated learning have enabled ...

A Survey of Deep Learning Techniques for Neural Machine Translation
In recent years, natural language processing (NLP) has got great develop...

IRS: A Large Synthetic Indoor Robotics Stereo Dataset for Disparity and Surface Normal Estimation
Indoor robotics localization, navigation and interaction heavily rely on...

MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning
Distributed synchronous stochastic gradient descent has been widely used...

Understanding Top-k Sparsification in Distributed Deep Learning
Distributed stochastic gradient descent (SGD) algorithms are widely depl...

Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees
To reduce the long training time of large deep neural network (DNN) mode...

Computer-Aided Clinical Skin Disease Diagnosis Using CNN and Object Detection Models
Skin disease is one of the most common types of human diseases, which ma...

Performance and Power Evaluation of AI Accelerators for Training Deep Learning Models
Deep neural networks (DNNs) have become widely used in many AI applicati...

AutoML: A Survey of the State-of-the-Art
Deep learning has penetrated all aspects of our lives and brought us gre...

The Impact of GPU DVFS on the Energy and Performance of Deep Learning: an Empirical Study
Over the past years, great progress has been made in improving the compu...

Measurement and Analysis of the Bitcoin Networks: A View from Mining Pools
Mining pools, the main components of the Bitcoin network, dominate the c...

GPU Accelerated Keccak (SHA3) Algorithm
Hash functions like SHA-1 or MD5 are one of the most important cryptogra...

GPU Accelerated AES Algorithm
It has been widely accepted that Graphics Processing Units (GPU) is one ...

A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks
Distributed synchronous stochastic gradient descent (SSGD) with data pa...

MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms
Distributed synchronous stochastic gradient descent has been widely used...

Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes
Synchronized stochastic gradient descent (SGD) optimizers with data para...

Modeling and Evaluation of Synchronous Stochastic Gradient Descent in Distributed Deep Learning on Multiple GPUs
With huge amounts of training data, deep learning has made great breakth...

Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs
Deep learning frameworks have been widely deployed on GPU servers for de...

Performance Evaluation of Deep Learning Tools in Docker Containers
With the success of deep learning techniques in a broad range of applica...

GPGPU Performance Estimation with Core and Memory Frequency Scaling
Graphics Processing Units (GPUs) support dynamic voltage and frequency s...
Xiaowen Chu