Difan Zou

research

∙ 06/20/2023

The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks

We study the implicit bias of batch normalization trained by gradient de...

2 Yuan Cao, et al. ∙

research

∙ 03/31/2023

Per-Example Gradient Regularization Improves Learning Signals from Noisy Data

Gradient regularization, as described in <cit.>, is a highly effective t...

0 Xuran Meng, et al. ∙

research

∙ 03/15/2023

The Benefits of Mixup for Feature Learning

Mixup, a simple data augmentation method that randomly mixes two data po...

0 Difan Zou, et al. ∙

research

∙ 03/03/2023

Learning High-Dimensional Single-Neuron ReLU Networks with Finite Samples

This paper considers the problem of learning a single ReLU neuron with s...

1 Jingfeng Wu, et al. ∙

research

∙ 08/03/2022

The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift

We study linear regression under covariate shift, where the marginal dis...

4 Jingfeng Wu, et al. ∙

research

∙ 03/07/2022

Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime

Stochastic gradient descent (SGD) has achieved great success due to its ...

4 Difan Zou, et al. ∙

research

∙ 10/12/2021

Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression

Stochastic gradient descent (SGD) has been demonstrated to generalize we...

5 Jingfeng Wu, et al. ∙

research

∙ 08/25/2021

Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization

Adaptive gradient methods such as Adam have gained increasing popularity...

17 Difan Zou, et al. ∙

research

∙ 08/10/2021

The Benefits of Implicit Regularization from SGD in Least Squares Problems

Stochastic gradient descent (SGD) exhibits strong algorithmic regulariza...

0 Difan Zou, et al. ∙

research

∙ 06/25/2021

Self-training Converts Weak Learners to Strong Learners in Mixture Models

We consider a binary classification problem when the data comes from a m...

2 Spencer Frei, et al. ∙

research

∙ 04/19/2021

Provable Robustness of Adversarial Training for Learning Halfspaces with Noise

We analyze the properties of adversarial training for learning adversari...

8 Difan Zou, et al. ∙

research

∙ 03/23/2021

Benign Overfitting of Constant-Stepsize SGD for Linear Regression

There is an increasing realization that algorithmic inductive biases are...

9 Difan Zou, et al. ∙

research

∙ 11/04/2020

Direction Matters: On the Implicit Regularization Effect of Stochastic Gradient Descent with Moderate Learning Rate

Understanding the algorithmic regularization effect of stochastic gradie...

14 Jingfeng Wu, et al. ∙

research

∙ 10/19/2020

Faster Convergence of Stochastic Gradient Langevin Dynamics for Non-Log-Concave Sampling

We establish a new convergence analysis of stochastic gradient Langevin ...

2 Difan Zou, et al. ∙

research

∙ 03/02/2020

On the Global Convergence of Training Deep Linear ResNets

We study the convergence of gradient descent (GD) and stochastic gradien...

15 Difan Zou, et al. ∙

research

∙ 11/27/2019

How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?

A recent line of research on deep learning focuses on the extremely over...

28 Zixiang Chen, et al. ∙

research

∙ 11/17/2019

Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks

Graph convolutional networks (GCNs) have recently received wide attentio...

10 Difan Zou, et al. ∙

research

∙ 11/02/2019

Laplacian Smoothing Stochastic Gradient Markov Chain Monte Carlo

As an important Markov Chain Monte Carlo (MCMC) method, stochastic gradi...

17 Bao Wang, et al. ∙

research

∙ 06/11/2019

An Improved Analysis of Training Over-parameterized Deep Neural Networks

A recent line of research has shown that gradient-based algorithms with ...

0 Difan Zou, et al. ∙

research

∙ 11/21/2018

Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks

We study the problem of training deep neural networks with Rectified Lin...

18 Difan Zou, et al. ∙

research

∙ 02/13/2018

Stochastic Variance-Reduced Hamilton Monte Carlo Methods

We propose a fast stochastic Hamilton Monte Carlo (HMC) method, for samp...

0 Difan Zou, et al. ∙

research

∙ 12/11/2017

Saving Gradient and Negative Curvature Computations: Finding Local Minima More Efficiently

We propose a family of nonconvex optimization algorithms that are able t...

0 Yaodong Yu, et al. ∙

Difan Zou

Featured Co-authors

Sign in with Google

Consider DeepAI Pro