
What causes the test error? Going beyond bias-variance via ANOVA
Modern machine learning methods are often overparametrized, allowing ada...
DeltaGrad: Rapid retraining of machine learning models
Machine learning models are not static and may need to be retrained on s...
How to reduce dimension with PCA and random projections?
In our "big data" age, the size and complexity of data is steadily incre...
The Implicit Regularization of Stochastic Gradient Flow for Least Squares
We study the implicit regularization of minibatch stochastic gradient d...
Limiting Spectrum of Randomized Hadamard Transform and Optimal Iterative Sketching Methods
We provide an exact analysis of the limiting spectrum of matrices random...
Implicit Regularization of Normalization Methods
Normalization methods such as batch normalization are commonly used in o...
Ridge Regression: Structure, Cross-Validation, and Sketching
We study the following three fundamental problems about ridge regression...
Invariance reduces Variance: Understanding Data Augmentation in Deep Learning and Beyond
Many complex deep learning models have found success by exploiting symme...
One-shot distributed ridge regression in high dimensions
In many areas, practitioners need to analyze large datasets that challen...
A New Theory for Sketching in Linear Regression
Large datasets create opportunities as well as analytic challenges. A re...
Distributed linear regression by averaging
Modern massive datasets pose an enormous computational burden to practit...
Robust Inference Under Heteroskedasticity via the Hadamard Estimator
Drawing statistical inferences from large datasets in a model-robust way...
Flexible Multiple Testing with the FACT Algorithm
Modern high-throughput science often leads to multiple testing problems:...
Deterministic parallel analysis
Factor analysis is widely used in many application areas. The first step...
