
-
Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
It is well-known that stochastic gradient noise (SGN) acts as implicit r...
-
Amata: An Annealing Mechanism for Adversarial Training Acceleration
Despite the empirical success in various domains, it has been revealed t...
-
Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher
Knowledge distillation is a strategy of training a student network with ...
-
Neural Approximate Sufficient Statistics for Implicit Models
We consider the fundamental problem of how to automatically construct su...
-
Informative Dropout for Robust Representation Learning: A Shape-bias Perspective
Convolutional Neural Networks (CNNs) are known to rely more on local tex...
-
Spherical Motion Dynamics of Deep Neural Networks with Batch Normalization and Weight Decay
We comprehensively reveal the learning dynamics of deep neural networks ...
-
Classify and Generate Reciprocally: Simultaneous Positive-Unlabelled Learning and Conditional Generation with Extra Data
The scarcity of class-labeled data is a ubiquitous bottleneck in a wide ...
-
Global Robustness Verification Networks
The wide deployment of deep neural networks, though achieving great succ...
-
Black-Box Certification with Randomized Smoothing: A Functional Optimization Based Framework
Randomized classifiers have been shown to provide a promising approach f...
-
Patch-level Neighborhood Interpolation: A General and Effective Graph-based Regularization Strategy
Regularization plays a crucial role in machine learning models, especial...
-
Towards Making Deep Transfer Learning Never Hurt
Transfer learning has been frequently used to improve deep neural netwo...
-
Spatio-temporal Manifold Learning for Human Motions via Long-horizon Modeling
Data-driven modeling of human motions is ubiquitous in computer graphics...
-
AdaGCN: Adaboosting Graph Convolutional Networks into Deep Models
The design of deep graph models still remains to be investigated and the...
-
The Multiplicative Noise in Stochastic Gradient Descent: Data-Dependent Regularization, Continuous and Discrete Approximation
The randomness in Stochastic Gradient Descent (SGD) is considered to pla...
-
Differentiable Neural Architecture Search via Proximal Iterations
Neural architecture search (NAS) recently attracts much research attenti...
-
On the Learning Dynamics of Two-layer Nonlinear Convolutional Neural Networks
Convolutional neural networks (CNNs) have achieved remarkable performanc...
-
Interpreting Adversarially Trained Convolutional Neural Networks
We attempt to interpret how adversarially trained convolutional neural n...
-
Bayesian Optimized Continual Learning with Attention Mechanism
Though neural networks have achieved much progress in various applicatio...
-
You Only Propagate Once: Accelerating Adversarial Training Using Maximal Principle
Deep learning achieves state-of-the-art results in many areas. However r...
-
You Only Propagate Once: Painless Adversarial Training Using Maximal Principle
Deep learning achieves state-of-the-art results in many areas. However r...
-
ST-UNet: A Spatio-Temporal U-Network for Graph-structured Time Series Modeling
The spatio-temporal graph learning is becoming an increasingly important...
-
3D Graph Convolutional Networks with Temporal Graphs: A Spatial Information Free Framework For Traffic Forecasting
Spatio-temporal prediction plays an important role in many application a...
-
Virtual Adversarial Training on Graph Convolutional Networks in Node Classification
The effectiveness of Graph Convolutional Networks (GCNs) has been demons...
-
Multi-Stage Self-Supervised Learning for Graph Convolutional Networks
Graph Convolutional Networks (GCNs) play a crucial role in graph learning...
-
Enhancing the Robustness of Deep Neural Networks by Boundary Conditional GAN
Deep neural networks have been widely deployed in various machine learni...
-
Towards Understanding Adversarial Examples Systematically: Exploring Data Size, Task and Model Factors
Most previous works usually explained adversarial examples from several ...
-
Quasi-potential as an implicit regularizer for the loss function in the stochastic gradient descent
We interpret the variational inference of the Stochastic Gradient Descen...
-
Tangent-Normal Adversarial Regularization for Semi-supervised Learning
The ever-increasing size of modern datasets combined with the difficulty...
-
Neural Control Variates for Variance Reduction
In statistics and machine learning, approximation of an intractable inte...
-
Reinforced Continual Learning
Most artificial intelligence models have limited ability to solve new t...
-
The Regularization Effects of Anisotropic Noise in Stochastic Gradient Descent
Understanding the generalization of deep learning has raised lots of con...
-
Understanding and Enhancing the Transferability of Adversarial Examples
State-of-the-art deep neural networks are known to be vulnerable to adve...
-
Spatio-temporal Graph Convolutional Neural Network: A Deep Learning Framework for Traffic Forecasting
The goal of traffic forecasting is to predict the future vital indicator...
-
Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes
It is widely observed that deep learning models with learned parameters ...
-
Learning with Noise: Enhance Distantly Supervised Relation Extraction with Dynamic Transition Matrix
Distant supervision significantly reduces human efforts in building trai...
-
Langevin Dynamics with Continuous Tempering for Training Deep Neural Networks
Minimizing non-convex and high-dimensional objective functions is challe...
-
Stochastic Parallel Block Coordinate Descent for Large-scale Saddle Point Problems
We consider convex-concave saddle point problems with a separable struct...
-
Covariance-Controlled Adaptive Langevin Thermostat for Large-Scale Bayesian Sampling
Monte Carlo sampling for Bayesian posterior inference is a common approa...
-
Adaptive Stochastic Primal-Dual Coordinate Descent for Separable Saddle Point Problems
We consider a generic convex-concave saddle point problem with separable...