
On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
Recent work has highlighted the role of initialization scale in determin...

The Min-Max Complexity of Distributed Stochastic Convex Optimization with Intermittent Communication
We resolve the min-max complexity of distributed stochastic convex optim...

Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
We provide a detailed asymptotic study of gradient flow trajectories and...

Minibatch vs Local SGD for Heterogeneous Distributed Learning
We analyze Local SGD (aka parallel or federated SGD) and Minibatch SGD i...

Mirrorless Mirror Descent: A More Natural Discretization of Riemannian Gradient Flow
We present a direct (primal only) derivation of Mirror Descent as a "par...

Kernel and Rich Regimes in Overparametrized Models
A recent line of work studies overparametrized neural networks in the "k...

Is Local SGD Better than Minibatch SGD?
We study local SGD (also known as parallel SGD and federated averaging),...

Lower Bounds for Non-Convex Stochastic Optimization
We lower bound the complexity of finding ϵ-stationary points (with gradi...

The gradient complexity of linear regression
We investigate the computational complexity of several basic linear alge...

Open Problem: The Oracle Complexity of Convex Optimization with Limited Memory
We note that known methods achieving the optimal oracle complexity for f...

Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis
We design a general framework for answering adaptive statistical queries...

Kernel and Deep Regimes in Overparametrized Models
A recent line of work studies overparametrized neural networks in the "k...

The Complexity of Making the Gradient Small in Stochastic Convex Optimization
We give nearly matching upper and lower bounds on the oracle complexity ...

Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints
Classifiers can be trained with data-dependent constraints to satisfy fa...

Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization
We suggest a general oracle-based framework that captures different para...

The Everlasting Database: Statistical Validity at a Fair Price
The problem of handling adaptivity in data analysis, intentional or not,...

Implicit Regularization in Matrix Factorization
We study implicit regularization when optimizing an underdetermined quad...

Tight Complexity Bounds for Optimizing Composite Objectives
We provide tight upper and lower bounds on the complexity of minimizing ...
Blake Woodworth