
-
SelfNorm and CrossNorm for Out-of-Distribution Robustness
Normalization techniques are crucial in stabilizing and accelerating the...
-
Improving Machine Reading Comprehension with Single-choice Decision and Transfer Learning
Multi-choice Machine Reading Comprehension (MMRC) aims to select the cor...
-
FeatGraph: A Flexible and Efficient Backend for Graph Neural Network Systems
Graph neural networks (GNNs) are gaining increasing popularity as a prom...
-
CSER: Communication-efficient SGD with Error Reset
The scalability of Distributed Stochastic Gradient Descent (SGD) is toda...
-
Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes
BERT has recently attracted a lot of attention in natural language under...
-
Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference
Modern deep neural networks increasingly make use of features such as dy...
-
Learning Context-Based Non-local Entropy Modeling for Image Compression
The entropy of the codes usually serves as the rate loss in the recent l...
-
Improving Semantic Segmentation via Self-Training
Deep learning usually achieves the best results with complete supervisio...
-
ResNeSt: Split-Attention Networks
While image classification models have recently continued to advance, mo...
-
AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data
We introduce AutoGluon-Tabular, an open-source AutoML framework that req...
-
GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing
We present GluonCV and GluonNLP, the deep learning toolkits for computer...
-
A Unified Optimization Approach for CNN Model Inference on Integrated GPUs
Modern deep learning applications urge to push the model inference takin...
-
Efficient and Effective Context-Based Convolutional Entropy Modeling for Image Compression
It has long been understood that precisely estimating the probabilistic ...
-
Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources
With an increasing demand for training powers for deep learning algorith...
-
Language Models with Transformers
The Transformer architecture is superior to RNN-based models in computat...
-
Learning Content-Weighted Deep Image Compression
Learning-based lossy image compression usually involves the joint optimi...
-
Bag of Freebies for Training Object Detection Neural Networks
Compared with enormous research achievements targeting better image cla...
-
Bag of Tricks for Image Classification with Convolutional Neural Networks
Much of the recent progress made in image classification research can be...
-
Optimizing CNN Model Inference on CPUs
The popularity of Convolutional Neural Network (CNN) models and the ubiq...
-
Approximate Distribution Matching for Sequence-to-Sequence Learning
Sequence-to-Sequence models were introduced to tackle many real-life pro...
-
Style Transfer as Unsupervised Machine Translation
Language style transferring rephrases text with specific stylistic attri...
-
Regularizing Neural Machine Translation by Target-bidirectional Agreement
Although Neural Machine Translation (NMT) has achieved remarkable progre...
-
Triangular Architecture for Rare Language Translation
Neural Machine Translation (NMT) performs poorly on the low-resource langu...
-
Achieving Human Parity on Automatic Chinese to English News Translation
Machine translation has made rapid advances in recent years. Millions of...
-
Joint Training for Neural Machine Translation Models with Monolingual Data
Monolingual data have been demonstrated to be helpful in improving trans...
-
Shift-Net: Image Inpainting via Deep Feature Rearrangement
Deep convolutional networks (CNNs) have exhibited their potential in ima...
-
Efficient Trimmed Convolutional Arithmetic Encoding for Lossless Image Compression
Arithmetic encoding is an essential class of coding techniques which hav...
-
Generative Bridging Network in Neural Sequence Prediction
Maximum Likelihood Estimation (MLE) suffers from data sparsity problem i...
-
Learning Convolutional Networks for Content-weighted Image Compression
Lossy image compression is generally formulated as a joint rate-distorti...
-
Deep Identity-aware Transfer of Facial Attributes
This paper presents a Deep convolutional network model for Identity-Awar...
-
Convolutional Network for Attribute-driven and Identity-preserving Human Face Generation
This paper focuses on the problem of generating human face pictures from...
-
Implicit Distortion and Fertility Models for Attention-based Encoder-Decoder NMT Model
Neural machine translation has shown very promising results lately. Most...
-
Data Driven Resource Allocation for Distributed Learning
In distributed machine learning, data is dispatched to multiple machines...
-
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
MXNet is a multi-language machine learning (ML) library to ease the deve...
-
High Performance Latent Variable Models
Latent variable models have accumulated a considerable amount of interes...
-
AdaDelay: Delay Adaptive Distributed Stochastic Convex Optimization
We study distributed stochastic convex optimization under the delayed gr...
-
Graph Partitioning via Parallel Submodular Approximation to Accelerate Distributed Machine Learning
Distributed computing excels at processing large scale data, but the com...
-
Empirical Evaluation of Rectified Activations in Convolutional Network
In this paper we investigate the performance of different types of recti...
-
Beyond Word-based Language Model in Statistical Machine Translation
Language model is one of the most important modules in statistical machi...