
Gradient descent follows the regularization path for general losses
Recent work across many machine learning disciplines has highlighted tha...

Directional convergence and alignment in deep learning
In this paper, we show that although the minimizers of cross-entropy and...

Neural tangent kernels, transportation mappings, and universal approximation
This paper establishes rates of universal approximation for the shallow ...

Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks
Recent work has revealed that overparameterized networks trained by grad...

Approximation power of random neural networks
This paper investigates the approximation power of three types of random...

A refined primal-dual analysis of the implicit bias
Recent work shows that gradient descent on linearly separable data is im...

A gradual, semi-discrete approach to generative network training via explicit Wasserstein minimization
This paper provides a simple procedure to fit generative networks to tar...

Size-Noise Tradeoffs in Generative Networks
This paper investigates the ability of generative networks to convert th...

Gradient descent aligns the layers of deep linear networks
This paper establishes risk convergence and asymptotic weight matrix ali...

Risk and parameter convergence of logistic regression
The logistic loss is strictly convex and does not attain its infimum; co...
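The non-attainment in that abstract can be checked numerically. The sketch below is illustrative only (toy 1D data, not the paper's setting): on linearly separable data, scaling any separating direction strictly decreases the logistic loss toward its infimum of zero, which is never attained at finite weights.

```python
import numpy as np

def logistic_loss(w, X, y):
    """Average logistic loss: mean of log(1 + exp(-y_i <x_i, w>))."""
    margins = y * (X @ w)
    return np.mean(np.log1p(np.exp(-margins)))

# Toy linearly separable 1D data: positives on the right, negatives on the left.
X = np.array([[1.0], [2.0], [-1.0], [-2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = np.array([1.0])  # a separating direction

# Rescaling w drives the loss down but it stays strictly positive.
losses = [logistic_loss(t * w, X, y) for t in (1.0, 10.0, 100.0)]
```

The minimizer is thus "at infinity"; only the direction of the iterates can converge, which is the object the paper studies.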

Social Welfare and Profit Maximization from Revealed Preferences
Consider the seller's problem of finding "optimal" prices for her (divis...

Spectrally-normalized margin bounds for neural networks
This paper presents a margin-based multiclass generalization bound for n...

Neural networks and rational functions
Neural networks and rational functions efficiently approximate each othe...

Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis
Stochastic Gradient Langevin Dynamics (SGLD) is a popular variant of Sto...
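A minimal sketch of the SGLD update, assuming its standard form (gradient step plus Gaussian noise scaled by an inverse temperature beta); the objective and all hyperparameters here are illustrative, not from the paper.

```python
import numpy as np

def sgld(grad, w0, eta=1e-2, beta=1e3, steps=2000, seed=0):
    """SGLD: w <- w - eta * grad(w) + sqrt(2 * eta / beta) * N(0, I)."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        noise = rng.standard_normal(w.shape)
        w = w - eta * grad(w) + np.sqrt(2 * eta / beta) * noise
    return w

# Nonconvex objective f(w) = (w^2 - 1)^2 with global minima at w = +/-1;
# the injected noise lets the iterate explore rather than merely descend.
grad = lambda w: 4 * w * (w**2 - 1)
w_final = sgld(grad, [0.1])
```

At low temperature (large beta) the noise is small and the iterate settles near one of the minima; the paper's contribution is a nonasymptotic analysis of this behavior.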

Benefits of depth in neural networks
For any positive integer k, there exist neural networks with Θ(k^3) laye...

Representation Benefits of Deep Feedforward Networks
This note provides a family of classification problems, indexed by a pos...

Convex Risk Minimization and Conditional Probability Estimation
This paper proves, in very general settings, that convex risk minimizati...

Scalable Nonlinear Learning with Adaptive Polynomial Expansions
Can we effectively learn a nonlinear representation in time comparable t...

Boosting with the Logistic Loss is Consistent
This manuscript provides optimization guarantees, generalization bounds,...

Margins, Shrinkage, and Boosting
This manuscript shows that AdaBoost and its immediate variants can produ...

Dirichlet draws are sparse with high probability
This note provides an elementary proof of the folklore fact that draws f...
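The folklore fact is easy to observe empirically (a quick numerical sketch, not the note's proof; the concentration parameter, dimension, and threshold below are arbitrary choices): with a small symmetric concentration parameter, almost all coordinates of a Dirichlet draw are vanishingly small.

```python
import numpy as np

def near_zero_fraction(alpha=0.01, dim=1000, thresh=1e-3, seed=0):
    """Fraction of coordinates below thresh in one Dirichlet(alpha, ..., alpha) draw."""
    rng = np.random.default_rng(seed)
    p = rng.dirichlet(alpha * np.ones(dim))
    return float(np.mean(p < thresh))

# With alpha = 0.01 in 1000 dimensions, nearly all probability mass
# concentrates on a handful of coordinates.
frac = near_zero_fraction()
```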

Tensor decompositions for learning latent variable models
This work considers a computationally and statistically efficient parame...

Agglomerative Bregman Clustering
This manuscript develops the theory of agglomerative clustering with Bre...
Matus Telgarsky
Assistant Professor, Computer Science at University of Illinois, Urbana-Champaign