
Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
It is well-known that stochastic gradient noise (SGN) acts as implicit r...

Amata: An Annealing Mechanism for Adversarial Training Acceleration
Despite the empirical success in various domains, it has been revealed t...

Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher
Knowledge distillation is a strategy of training a student network with ...

Neural Approximate Sufficient Statistics for Implicit Models
We consider the fundamental problem of how to automatically construct su...

Informative Dropout for Robust Representation Learning: A Shape-bias Perspective
Convolutional Neural Networks (CNNs) are known to rely more on local tex...

Spherical Motion Dynamics of Deep Neural Networks with Batch Normalization and Weight Decay
We comprehensively reveal the learning dynamics of deep neural networks ...

Classify and Generate Reciprocally: Simultaneous Positive-Unlabelled Learning and Conditional Generation with Extra Data
The scarcity of class-labeled data is a ubiquitous bottleneck in a wide ...

Global Robustness Verification Networks
The wide deployment of deep neural networks, though achieving great succ...

Black-Box Certification with Randomized Smoothing: A Functional Optimization Based Framework
Randomized classifiers have been shown to provide a promising approach f...

Patch-level Neighborhood Interpolation: A General and Effective Graph-based Regularization Strategy
Regularization plays a crucial role in machine learning models, especial...

Towards Making Deep Transfer Learning Never Hurt
Transfer learning has been frequently used to improve deep neural netwo...

Spatiotemporal Manifold Learning for Human Motions via Long-horizon Modeling
Data-driven modeling of human motions is ubiquitous in computer graphics...

AdaGCN: Adaboosting Graph Convolutional Networks into Deep Models
The design of deep graph models still remains to be investigated and the...

The Multiplicative Noise in Stochastic Gradient Descent: Data-Dependent Regularization, Continuous and Discrete Approximation
The randomness in Stochastic Gradient Descent (SGD) is considered to pla...

Differentiable Neural Architecture Search via Proximal Iterations
Neural architecture search (NAS) has recently attracted much research attenti...

On the Learning Dynamics of Two-layer Nonlinear Convolutional Neural Networks
Convolutional neural networks (CNNs) have achieved remarkable performanc...

Interpreting Adversarially Trained Convolutional Neural Networks
We attempt to interpret how adversarially trained convolutional neural n...

Bayesian Optimized Continual Learning with Attention Mechanism
Though neural networks have achieved much progress in various applicatio...

You Only Propagate Once: Accelerating Adversarial Training Using Maximal Principle
Deep learning achieves state-of-the-art results in many areas. However r...

You Only Propagate Once: Painless Adversarial Training Using Maximal Principle
Deep learning achieves state-of-the-art results in many areas. However r...

ST-UNet: A Spatio-Temporal U-Network for Graph-structured Time Series Modeling
The spatiotemporal graph learning is becoming an increasingly important...

3D Graph Convolutional Networks with Temporal Graphs: A Spatial Information Free Framework For Traffic Forecasting
Spatiotemporal prediction plays an important role in many application a...

Virtual Adversarial Training on Graph Convolutional Networks in Node Classification
The effectiveness of Graph Convolutional Networks (GCNs) has been demons...

Multi-Stage Self-Supervised Learning for Graph Convolutional Networks
Graph Convolutional Networks (GCNs) play a crucial role in graph learning...

Enhancing the Robustness of Deep Neural Networks by Boundary Conditional GAN
Deep neural networks have been widely deployed in various machine learni...

Towards Understanding Adversarial Examples Systematically: Exploring Data Size, Task and Model Factors
Most previous works usually explained adversarial examples from several ...

Quasipotential as an implicit regularizer for the loss function in the stochastic gradient descent
We interpret the variational inference of the Stochastic Gradient Descen...

Tangent-Normal Adversarial Regularization for Semi-supervised Learning
The ever-increasing size of modern datasets combined with the difficulty...

Neural Control Variates for Variance Reduction
In statistics and machine learning, approximation of an intractable inte...

Reinforced Continual Learning
Most artificial intelligence models have limited ability to solve new t...

The Regularization Effects of Anisotropic Noise in Stochastic Gradient Descent
Understanding the generalization of deep learning has raised lots of con...

Understanding and Enhancing the Transferability of Adversarial Examples
State-of-the-art deep neural networks are known to be vulnerable to adve...

Spatiotemporal Graph Convolutional Neural Network: A Deep Learning Framework for Traffic Forecasting
The goal of traffic forecasting is to predict the future vital indicator...

Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes
It is widely observed that deep learning models with learned parameters ...

Learning with Noise: Enhance Distantly Supervised Relation Extraction with Dynamic Transition Matrix
Distant supervision significantly reduces human efforts in building trai...

Langevin Dynamics with Continuous Tempering for Training Deep Neural Networks
Minimizing non-convex and high-dimensional objective functions is challe...

Stochastic Parallel Block Coordinate Descent for Large-scale Saddle Point Problems
We consider convex-concave saddle point problems with a separable struct...

Covariance-Controlled Adaptive Langevin Thermostat for Large-Scale Bayesian Sampling
Monte Carlo sampling for Bayesian posterior inference is a common approa...

Adaptive Stochastic Primal-Dual Coordinate Descent for Separable Saddle Point Problems
We consider a generic convex-concave saddle point problem with separable...
Zhanxing Zhu