We present a new general-purpose algorithm for learning classes of [0,1]...
We consider Sharpness-Aware Minimization (SAM), a gradient-based optimiz...
We bound the excess risk of interpolating deep linear networks trained u...
van Rooyen et al. introduced a notion of convex loss functions being rob...
We prove a lower bound on the excess risk of sparse interpolating proced...
The recent success of neural network models has shone light on a rather...
The Neural Tangent Kernel (NTK) is the wide-network limit of a kernel de...
We establish conditions under which gradient descent applied to fixed-wi...
We study the training of finite-width two-layer smoothed ReLU networks f...
We consider bounds on the generalization performance of the least-norm l...
We prove bounds on the population risk of the maximum margin algorithm f...
We study the convergence of gradient descent (GD) and stochastic gradien...
We consider the problem of sampling from a strongly log-concave density ...
The phenomenon of benign overfitting is one of the key mysteries uncover...
We prove bounds on the generalization error of convolutional networks. T...
We analyze the joint probability distribution on the lengths of the vect...
We study density estimation for classes of shift-invariant distributions...
We study the learnability of sums of independent integer random variable...
We characterize the singular values of the linear transformation associa...
We show that any smooth bi-Lipschitz h can be represented exactly as a c...
We analyze algorithms for approximating a function f(x) = Φ x mapping ℝ^d...
We analyze dropout in deep networks with rectified linear units and the...
Dropout is a simple but effective technique for learning in neural netwo...
We introduce a new approach for designing computationally efficient lear...
We provide new results concerning label efficient, polynomial time, pass...