
Implicit Regularization in Deep Learning: A View from Function Space
We approach the problem of implicit regularization in deep learning from...

Stochastic Hamiltonian Gradient Methods for Smooth Games
The success of adversarial formulations in machine learning has brought ...

Differentiable Causal Discovery from Interventional Data
Discovering causal relationships in data is a challenging task that invo...

Adversarial Example Games
The existence of adversarial examples capable of fooling trained neural ...

Adaptive Gradient Methods Converge Faster with Over-Parameterization (and you can do a line-search)
As adaptive gradient methods are typically used for training over-parame...

An Analysis of the Adaptation Speed of Causal Models
We consider the problem of discovering the causal process that generated...

Stochastic Polyak Stepsize for SGD: An Adaptive Learning Rate for Fast Convergence
We propose a stochastic variant of the classical Polyak stepsize (Polya...

Accelerating Smooth Games by Manipulating Spectral Shapes
We use matrix iteration theory to characterize acceleration in smooth ga...

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation
We consider stochastic second order methods for minimizing strongly-conv...

GEAR: Geometry-Aware Rényi Information
Shannon's seminal theory of information has been of paramount importance...

A Tight and Unified Analysis of Extragradient for a Whole Spectrum of Differentiable Games
We consider differentiable games: multi-objective minimization problems,...

A Closer Look at the Optimization Landscapes of Generative Adversarial Networks
Generative adversarial networks have been very successful in generative ...

Gradient-Based Neural DAG Learning
We propose a novel score-based approach to learning a directed acyclic g...

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
Recent works have shown that stochastic gradient descent (SGD) achieves ...

Implicit Regularization of Discrete Gradient Dynamics in Deep Linear Neural Networks
When optimizing over-parameterized models, such as deep neural networks,...

Reducing Noise in GAN Training with Variance Reduced Extragradient
Using large mini-batches when training generative adversarial networks (...

Centroid Networks for Few-Shot Clustering and Unsupervised Few-Shot Classification
Traditional clustering algorithms such as K-means rely heavily on the na...

Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information
This paper offers a methodological contribution at the intersection of m...

Quantifying Learning Guarantees for Convex but Inconsistent Surrogates
We study consistency properties of machine learning methods based on min...

A Modern Take on the Bias-Variance Tradeoff in Neural Networks
We revisit the bias-variance tradeoff for neural networks in light of mo...

Scattering Networks for Hybrid Representation Learning
Scattering networks are a class of designed Convolutional Neural Network...

Predicting Solution Summaries to Integer Linear Programs under Imperfect Information with Machine Learning
The paper provides a methodological contribution at the intersection of ...

Negative Momentum for Improved Game Dynamics
Games generalize the optimization paradigm by introducing different obje...

Frank-Wolfe Splitting via Augmented Lagrangian Method
Minimizing a function over an intersection of convex sets is an importan...

A Variational Inequality Perspective on Generative Adversarial Nets
Stability has been a recurrent issue in training generative adversarial ...

A3T: Adversarially Augmented Adversarial Training
Recent research showed that deep neural networks are highly sensitive to...

Improved asynchronous parallel optimization analysis for stochastic incremental methods
As datasets continue to increase in size and multi-core computer archite...

Adaptive Stochastic Dual Coordinate Ascent for Conditional Random Fields
This work investigates training Conditional Random Fields (CRF) by Stoch...

Parametric Adversarial Divergences are Good Task Losses for Generative Modeling
Generative modeling of high dimensional data like images is a notoriousl...

Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization
Due to their simplicity and excellent performance, parallel asynchronous...

A Closer Look at Memorization in Deep Networks
We examine the role of memorization in deep learning, drawing connection...

SEARNN: Training RNNs with Global-Local Losses
We propose SEARNN, a novel training algorithm for recurrent neural netwo...

On Structured Prediction Theory with Calibrated Convex Surrogate Losses
We provide novel theoretical insights on structured prediction in the co...

Joint Discovery of Object States and Manipulation Actions
Many human activities involve object manipulations aiming to modify the ...

Frank-Wolfe Algorithms for Saddle Point Problems
We extend the Frank-Wolfe (FW) optimization algorithm to solve constrain...

Convergence Rate of Frank-Wolfe for Non-Convex Objectives
We give a simple proof that the Frank-Wolfe algorithm obtains a stationa...

ASAGA: Asynchronous Parallel SAGA
We describe ASAGA, an asynchronous parallel version of the incremental g...

Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs
In this paper, we propose several improvements on the block-coordinate F...

PAC-Bayesian Theory Meets Bayesian Inference
We exhibit a strong link between frequentist PAC-Bayesian risk bounds an...

Beyond CCA: Moment Matching for Multi-View Models
We introduce three novel semi-parametric extensions of probabilistic can...

On the Global Linear Convergence of Frank-Wolfe Optimization Variants
The Frank-Wolfe (FW) optimization algorithm has lately regained popular...

Barrier Frank-Wolfe for Marginal Inference
We introduce a globally-convergent algorithm for optimizing the tree-rew...

Rethinking LDA: moment matching for discrete ICA
We consider moment matching techniques for estimation in Latent Dirichle...

Unsupervised Learning from Narrated Instruction Videos
We address the problem of automatically learning the main steps to compl...

Variance Reduced Stochastic Gradient Descent with Neighbors
Stochastic Gradient Descent (SGD) is a workhorse in machine learning, ye...

Sequential Kernel Herding: Frank-Wolfe Optimization for Particle Filtering
Recently, the Frank-Wolfe optimization algorithm was suggested as a proc...

On Pairwise Costs for Network Flow Multi-Object Tracking
Multi-object tracking has been recently approached with the min-cost net...

SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives
In this work we introduce a new optimisation method called SAGA in the s...

A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method
In this note, we present a new averaging technique for the projected sto...

Block-Coordinate Frank-Wolfe Optimization for Structural SVMs
We propose a randomized block-coordinate variant of the classic Frank-Wo...