
Discovering Latent Causal Variables via Mechanism Sparsity: A New Principle for Nonlinear ICA
It can be argued that finding an interpretable low-dimensional represent...

Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity
Two of the most prominent algorithms for solving unconstrained smooth ga...

Structured Convolutional Kernel Networks for Airline Crew Scheduling
Motivated by the needs from an airline crew scheduling application, we i...

Repurposing Pretrained Models for Robust Out-of-domain Few-Shot Learning
Model-agnostic meta-learning (MAML) is a popular method for few-shot lea...

Online Adversarial Attacks
Adversarial attacks expose important vulnerabilities of deep learning mo...

SVRG Meets AdaGrad: Painless Variance Reduction
Variance reduction (VR) methods for finite-sum minimization typically re...

Geometry-Aware Universal Mirror-Prox
Mirror-prox (MP) is a well-known algorithm to solve variational inequali...

On the Convergence of Continuous Constrained Optimization for Structure Learning
Structure learning of directed acyclic graphs (DAGs) is a fundamental pr...

Machine Learning in Airline Crew Pairing to Construct Initial Clusters for Dynamic Constraint Aggregation
The crew pairing problem (CPP) is generally modelled as a set partitioni...

Flight-connection Prediction for Airline Crew Scheduling to Construct Initial Clusters for OR Optimizer
We present a case study of using machine learning classification algorit...

Implicit Regularization in Deep Learning: A View from Function Space
We approach the problem of implicit regularization in deep learning from...

Stochastic Hamiltonian Gradient Methods for Smooth Games
The success of adversarial formulations in machine learning has brought ...

Differentiable Causal Discovery from Interventional Data
Discovering causal relationships in data is a challenging task that invo...

Adversarial Example Games
The existence of adversarial examples capable of fooling trained neural ...

Adaptive Gradient Methods Converge Faster with Over-Parameterization (and you can do a line-search)
As adaptive gradient methods are typically used for training over-parame...

An Analysis of the Adaptation Speed of Causal Models
We consider the problem of discovering the causal process that generated...

Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
We propose a stochastic variant of the classical Polyak step-size (Polya...

Accelerating Smooth Games by Manipulating Spectral Shapes
We use matrix iteration theory to characterize acceleration in smooth ga...

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation
We consider stochastic second order methods for minimizing strongly-conv...

GEAR: Geometry-Aware Rényi Information
Shannon's seminal theory of information has been of paramount importance...

A Tight and Unified Analysis of Extragradient for a Whole Spectrum of Differentiable Games
We consider differentiable games: multi-objective minimization problems,...

A Closer Look at the Optimization Landscapes of Generative Adversarial Networks
Generative adversarial networks have been very successful in generative ...

Gradient-Based Neural DAG Learning
We propose a novel score-based approach to learning a directed acyclic g...

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
Recent works have shown that stochastic gradient descent (SGD) achieves ...

Implicit Regularization of Discrete Gradient Dynamics in Deep Linear Neural Networks
When optimizing over-parameterized models, such as deep neural networks,...

Reducing Noise in GAN Training with Variance Reduced Extragradient
Using large minibatches when training generative adversarial networks (...

Centroid Networks for Few-Shot Clustering and Unsupervised Few-Shot Classification
Traditional clustering algorithms such as K-means rely heavily on the na...

Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information
This paper offers a methodological contribution at the intersection of m...

Quantifying Learning Guarantees for Convex but Inconsistent Surrogates
We study consistency properties of machine learning methods based on min...

A Modern Take on the Bias-Variance Tradeoff in Neural Networks
We revisit the bias-variance tradeoff for neural networks in light of mo...

Scattering Networks for Hybrid Representation Learning
Scattering networks are a class of designed Convolutional Neural Network...

Predicting Solution Summaries to Integer Linear Programs under Imperfect Information with Machine Learning
The paper provides a methodological contribution at the intersection of ...

Negative Momentum for Improved Game Dynamics
Games generalize the optimization paradigm by introducing different obje...

Frank-Wolfe Splitting via Augmented Lagrangian Method
Minimizing a function over an intersection of convex sets is an importan...

A Variational Inequality Perspective on Generative Adversarial Nets
Stability has been a recurrent issue in training generative adversarial ...

A3T: Adversarially Augmented Adversarial Training
Recent research showed that deep neural networks are highly sensitive to...

Improved asynchronous parallel optimization analysis for stochastic incremental methods
As datasets continue to increase in size and multi-core computer archite...

Adaptive Stochastic Dual Coordinate Ascent for Conditional Random Fields
This work investigates training Conditional Random Fields (CRF) by Stoch...

Parametric Adversarial Divergences are Good Task Losses for Generative Modeling
Generative modeling of high dimensional data like images is a notoriousl...

Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization
Due to their simplicity and excellent performance, parallel asynchronous...

A Closer Look at Memorization in Deep Networks
We examine the role of memorization in deep learning, drawing connection...

SEARNN: Training RNNs with Global-Local Losses
We propose SEARNN, a novel training algorithm for recurrent neural netwo...

On Structured Prediction Theory with Calibrated Convex Surrogate Losses
We provide novel theoretical insights on structured prediction in the co...

Joint Discovery of Object States and Manipulation Actions
Many human activities involve object manipulations aiming to modify the ...

Frank-Wolfe Algorithms for Saddle Point Problems
We extend the Frank-Wolfe (FW) optimization algorithm to solve constrain...

Convergence Rate of Frank-Wolfe for Non-Convex Objectives
We give a simple proof that the Frank-Wolfe algorithm obtains a stationa...

ASAGA: Asynchronous Parallel SAGA
We describe ASAGA, an asynchronous parallel version of the incremental g...

Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs
In this paper, we propose several improvements on the block-coordinate F...

PAC-Bayesian Theory Meets Bayesian Inference
We exhibit a strong link between frequentist PAC-Bayesian risk bounds an...

Beyond CCA: Moment Matching for Multi-View Models
We introduce three novel semi-parametric extensions of probabilistic can...