
Weighted Optimization: better generalization by smoother interpolation
We provide a rigorous analysis of how implicit bias towards smooth inter...

Implicit Regularization of Normalization Methods
Normalization methods such as batch normalization are commonly used in o...

Faster Johnson-Lindenstrauss Transforms via Kronecker Products
The Kronecker product is an important matrix operation with a wide range...

Linear Convergence of Adaptive Stochastic Gradient Descent
We prove that the norm version of the adaptive stochastic gradient metho...

Bias of Homotopic Gradient Descent for the Hinge Loss
Gradient descent is a simple and widely used optimization method for mac...

AdaOja: Adaptive Learning Rates for Streaming PCA
Oja's algorithm has been the cornerstone of streaming methods in Princip...

Global Convergence of Adaptive Gradient Methods for An Overparameterized Neural Network
Adaptive gradient methods like AdaGrad are widely used in optimizing neu...

Recovery guarantees for polynomial approximation from dependent data with outliers
Learning nonlinear systems from noisy, limited, and/or dependent data i...

Compressed sensing with a jackknife and a bootstrap
Compressed sensing proposes to reconstruct more degrees of freedom in a ...

AdaGrad stepsizes: Sharp convergence over nonconvex landscapes, from any initialization
Adaptive gradient methods such as AdaGrad and its variants update the st...

Extracting structured dynamical systems using sparse optimization with very few samples
Learning governing equations allows for deeper understanding of the stru...

Greedy Variance Estimation for the LASSO
Recent results have proven the minimax optimality of LASSO and related a...

WNGrad: Learn the Learning Rate in Gradient Descent
Adjusting the learning rate schedule in stochastic gradient methods is a...

A polynomial-time relaxation of the Gromov-Hausdorff distance
The Gromov-Hausdorff distance provides a metric on the set of isometry c...

Clustering subgaussian mixtures by semidefinite programming
We introduce a model-free relax-and-round algorithm for k-means clusteri...

The local convexity of solving systems of quadratic equations
This paper considers the recovery of a rank r positive semidefinite matr...

One-bit compressive sensing with norm estimation
Consider the recovery of an unknown signal x from quantized linear measu...

Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm
We obtain an improved finite-sample guarantee on the linear convergence ...

Recovery guarantees for exemplar-based clustering
For a certain class of distributions, we prove that the linear programmi...

Completing Any Low-rank Matrix, Provably
Matrix completion, i.e., the exact and provable recovery of a low-rank m...

Near-optimal compressed sensing guarantees for total variation minimization
Consider the problem of reconstructing a multidimensional signal from an...

Stable and robust sampling strategies for compressive imaging
In many signal processing applications, one wishes to acquire images tha...
Rachel Ward
Assistant Professor of Mathematics at the University of Texas at Austin; Postdoctoral Researcher at the Courant Institute, New York University, 2009-2011