
Efficient Sparse-Dense Matrix-Matrix Multiplication on GPUs Using the Customized Sparse Storage Format
Multiplication of a sparse matrix to a dense matrix (SpDM) is widely use...

Rethinking Performance Estimation in Neural Architecture Search
Neural architecture search (NAS) remains a challenging problem, which is...

A general method for finding the compositional inverses of permutations from the AGW criterion
Permutation polynomials and their compositional inverses have wide appli...

DyNet: Dynamic Convolution for Accelerating Convolutional Neural Networks
Convolution operator is the core of convolutional neural networks (CNNs)...

A Graph Joining Greedy Approach to Binary de Bruijn Sequences
Using greedy algorithms to generate de Bruijn sequences is a classical a...

Data Poisoning Attacks on Federated Machine Learning
Federated machine learning which enables resource constrained node devic...

FADNet: A Fast and Accurate Network for Disparity Estimation
Deep neural networks (DNNs) have achieved great success in the area of c...

An Efficiently Generated Family of Binary de Bruijn Sequences
We study how to generate binary de Bruijn sequences efficiently from the...

Communication Contention Aware Scheduling of Multiple Deep Learning Training Jobs
Distributed Deep Learning (DDL) has rapidly grown its popularity since i...

Multilayer Representation Fusion for Neural Machine Translation
Neural machine translation systems require a number of stacked layers fo...

Neural Machine Translation with Joint Representation
Though early successes of Statistical Machine Translation (SMT) systems ...

Force-guided High-precision Grasping Control of Fragile and Deformable Objects using sEMG-based Force Prediction
Regulating contact forces with high precision is crucial for grasping an...

BETANAS: BalancEd TrAining and selective drop for Neural Architecture Search
Automatic neural architecture search techniques are becoming increasingl...

Adversarial AutoAugment
Data augmentation (DA) has been widely utilized to improve generalizatio...

IRS: A Large Synthetic Indoor Robotics Stereo Dataset for Disparity and Surface Normal Estimation
Indoor robotics localization, navigation and interaction heavily rely on...

Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees
To reduce the long training time of large deep neural network (DNN) mode...

General Criteria for Successor Rules to Efficiently Generate Binary de Bruijn Sequences
We put forward new general criteria to design successor rules that gener...

Anchor Diffusion for Unsupervised Video Object Segmentation
Unsupervised video object segmentation has often been tackled by methods...

Performance and Power Evaluation of AI Accelerators for Training Deep Learning Models
Deep neural networks (DNNs) have become widely used in many AI applicati...

Learning Deep Transformer Models for Machine Translation
Transformer is the state-of-the-art model in recent machine translation ...

The Impact of GPU DVFS on the Energy and Performance of Deep Learning: an Empirical Study
Over the past years, great progress has been made in improving the compu...

On the stability of periodic binary sequences with zone restriction
Traditional global stability measure for sequences is hard to determine ...

A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks
Distributed synchronous stochastic gradient descent (S-SGD) with data pa...

Vector and Line Quantization for Billion-scale Similarity Search on GPUs
Billion-scale high-dimensional approximate nearest neighbour (ANN) searc...

SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks
Siamese network based trackers formulate tracking as convolutional featu...

Fast Online Object Tracking and Segmentation: A Unifying Approach
In this paper we illustrate how to perform both visual object tracking a...

Distractor-aware Siamese Networks for Visual Object Tracking
Recently, Siamese networks have drawn great attention in visual tracking...

Permutation polynomials and complete permutation polynomials over F_q^3
Motivated by many recent constructions of permutation polynomials over F...

Modeling and Evaluation of Synchronous Stochastic Gradient Descent in Distributed Deep Learning on Multiple GPUs
With huge amounts of training data, deep learning has made great breakth...

A Hybrid Recommendation Method Based on Feature for Offline Book Personalization
Recommendation system has been widely used in different areas. Collabora...

A Recursive Construction of Permutation Polynomials over F_q^2 with Odd Characteristic from Rédei Functions
In this paper, we construct two classes of permutation polynomials over ...

GPGPU Performance Estimation with Core and Memory Frequency Scaling
Graphics Processing Units (GPUs) support dynamic voltage and frequency s...
Qiang Wang