
Adaptive Newton Sketch: Linear-time Optimization with Quadratic Convergence and Effective Hessian Dimensionality
We propose a randomized algorithm with quadratic convergence rate for co...
Training Quantized Neural Networks to Global Optimality via Semidefinite Programming
Neural networks (NNs) have been extremely successful across many tasks i...
Fast Convex Quadratic Optimization Solvers with Adaptive Sketching-based Preconditioners
We consider least-squares problems with quadratic regularization and pro...
Distributed Learning and Democratic Embeddings: Polynomial-Time Source Coding Schemes Can Achieve Minimax Lower Bounds for Distributed Gradient Descent under Communication Cons
In this work, we consider the distributed optimization setting where inf...
Demystifying Batch Normalization in ReLU Networks: Equivalent Convex Optimization Models and Implicit Regularization
Batch Normalization (BN) is a commonly used technique to accelerate and ...
Neural Spectrahedra and Semidefinite Lifts: Global Convex Optimization of Polynomial Activation Neural Networks in Fully Polynomial-Time
The training of two-layer neural networks with nonlinear activation func...
Vector-output ReLU Neural Network Problems are Copositive Programs: Convex Analysis of Two Layer Networks and Polynomial-time Algorithms
We describe the convex semi-infinite dual of the two-layer vector-output...
Adaptive and Oblivious Randomized Subspace Methods for High-Dimensional Optimization: Sharp Analysis and Lower Bounds
We propose novel randomized optimization methods for high-dimensional co...
Convex Regularization Behind Neural Reconstruction
Neural networks have shown tremendous potential for reconstructing high...
Approximate Weighted CR Coded Matrix Multiplication
One of the most common, but at the same time expensive operations in lin...
Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization
In distributed second order optimization, a standard strategy is to aver...
Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time
We study training of Convolutional Neural Networks (CNNs) with ReLU acti...
Lower Bounds and a Near-Optimal Shrinkage Estimator for Least Squares using Random Projections
In this work, we consider the deterministic optimization using random pr...
All Local Minima are Global for Two-Layer ReLU Neural Networks: The Hidden Convex Optimization Landscape
We are interested in two-layer ReLU neural networks from an optimization...
Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization
We propose a new randomized algorithm for solving L2-regularized least-s...
Global Multiclass Classification from Heterogeneous Local Models
Multiclass classification problems are most often solved by either train...
Straggler Robust Distributed Matrix Inverse Approximation
A cumbersome operation in numerical analysis and linear algebra, optimiz...
Convex Geometry and Duality of Overparameterized Neural Networks
We develop a convex analytic framework for ReLU neural networks which el...
Separating the Effects of Batch Normalization on CNN Training Speed and Stability Using Classical Adaptive Filter Theory
Batch Normalization (BatchNorm) is commonly used in Convolutional Neural...
Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-Layer Networks
We develop exact representations of two layer neural networks with recti...
Convex Duality of Deep Neural Networks
We study regularized deep neural networks and introduce an analytic fram...
Optimal Randomized First-Order Methods for Least-Squares Problems
We provide an exact analysis of a class of randomized algorithms for sol...
Distributed Averaging Methods for Randomized Second Order Optimization
We consider distributed optimization problems where forming the Hessian ...
Distributed Sketching Methods for Privacy Preserving Regression
In this work, we study distributed sketching methods for large scale reg...
Global Convergence of Frank Wolfe on One Hidden Layer Networks
We derive global convergence bounds for the Frank Wolfe algorithm when t...
Limiting Spectrum of Randomized Hadamard Transform and Optimal Iterative Sketching Methods
We provide an exact analysis of the limiting spectrum of matrices random...
Weighted Gradient Coding with Leverage Score Sampling
A major hurdle in machine learning is scalability to massive datasets. A...
Regularized Momentum Iterative Hessian Sketch for Large Scale Linear System of Equations
In this article, Momentum Iterative Hessian Sketch (MIHS) techniques, a...
Faster Least Squares Optimization
We investigate randomized methods for solving overdetermined linear leas...
Distributed Black-Box Optimization via Error Correcting Codes
We introduce a novel distributed derivative-free optimization framework ...
High-Dimensional Optimization in Adaptive Random Subspaces
We propose a new randomized optimization method for high-dimensional pro...
Polar Coded Distributed Matrix Multiplication
We propose a polar coding mechanism for distributed matrix multiplicatio...
Convex Relaxations of Convolutional Neural Nets
We propose convex relaxations for convolutional neural nets with one hid...
Newton Sketch: A Linear-time Optimization Algorithm with Linear-Quadratic Convergence
We propose a randomized second-order method for optimization known as th...
Randomized sketches for kernels: Fast and optimal nonparametric regression
Kernel ridge regression (KRR) is a standard method for performing nonpa...
Iterative Hessian sketch: Fast and accurate solution approximation for constrained least-squares
We study randomized sketching methods for approximately solving least-sq...
Randomized Sketches of Convex Programs with Sharp Guarantees
Random projection (RP) is a classical technique for reducing storage and...
