
Gradient Descent Provably Optimizes Overparameterized Neural Networks
One of the mysteries in the success of neural networks is randomly initial...
A Deep Reinforcement Learning Approach for Global Routing
Global routing has been a historically challenging problem in electronic...
ProBO: a Framework for Using Probabilistic Programming in Bayesian Optimization
Optimizing an expensive-to-query function is a common task in science an...
Competence-based Curriculum Learning for Neural Machine Translation
Current state-of-the-art NMT systems use large neural networks that are ...
Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly
Bayesian Optimisation (BO) refers to a suite of techniques for global o...
LucidDream: Controlled Temporally-Consistent DeepDream on Videos
In this work, we aim to propose a set of techniques to improve the contr...
Unsupervised Program Synthesis for Images using Tree-Structured LSTM
Program synthesis has recently emerged as a promising approach to the im...
Developing Creative AI to Generate Sculptural Objects
We explore the intersection of human and machine creativity by generatin...
Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels
While graph kernels (GKs) are easy to train and enjoy provable theoretic...
Cautious Deep Learning
Most classifiers operate by selecting the maximum of an estimate of the ...
Estimating Cosmological Parameters from the Dark Matter Distribution
A grand challenge of 21st-century cosmology is to accurately estimat...
A Generic Approach for Escaping Saddle Points
A central challenge to using first-order methods for optimizing nonconve...
On the Reconstruction Risk of Convolutional Sparse Dictionary Learning
Sparse dictionary learning (SDL) has become a popular method for adaptiv...
Recurrent Estimation of Distributions
This paper presents the recurrent estimation of distributions (RED) for ...
Gradient Descent Can Take Exponential Time to Escape Saddle Points
Although gradient descent (GD) almost always escapes saddle points asymp...
Asynchronous Parallel Bayesian Optimisation via Thompson Sampling
We design and analyse variations of the classical Thompson sampling (TS)...
MMD GAN: Towards Deeper Understanding of Moment Matching Network
Generative moment matching network (GMMN) is a deep generative model tha...
Data-driven Random Fourier Features using Stein Effect
Large-scale kernel approximation is an important problem in machine lear...
Multi-fidelity Bayesian Optimisation with Continuous Approximations
Bandit methods for black-box optimisation, such as Bayesian optimisation...
Deep Sets
In this paper, we study the problem of designing objective functions for...
The Statistical Recurrent Unit
Sophisticated gated recurrent neural network architectures like LSTMs an...
Equivariance Through Parameter-Sharing
We propose to study equivariance in deep neural networks through paramet...
Nonparanormal Information Estimation
We study the problem of using i.i.d. samples from an unknown multivariat...
Query Efficient Posterior Estimation in Scientific Experiments via Bayesian Active Learning
A common problem in disciplines of applied Statistics research such as A...
Deep Learning with Sets and Point Clouds
We introduce a simple permutation equivariant layer for deep learning wi...
Annealing Gaussian into ReLU: a New Sampling Strategy for Leaky-ReLU RBM
Restricted Boltzmann Machine (RBM) is a bipartite graphical model that i...
One Network to Solve Them All: Solving Linear Inverse Problems using Deep Projection Models
While deep learning methods have achieved state-of-the-art performance i...
Enabling Dark Energy Science with Deep Generative Models of Galaxy Images
Understanding the nature of dark energy, the mysterious force driving th...
AIDE: Fast and Communication Efficient Distributed Optimization
In this paper, we present two new communication-efficient methods for di...
Stochastic Frank-Wolfe Methods for Nonconvex Optimization
We study Frank-Wolfe methods for nonconvex stochastic and finite-sum opt...
Finite-Sample Analysis of Fixed-k Nearest Neighbor Density Functional Estimators
We provide finite-sample analysis of a general framework for using k-nea...
Fast Stochastic Methods for Nonsmooth Nonconvex Optimization
We analyze stochastic algorithms for optimizing nonconvex, nonsmooth fin...
Efficient Nonparametric Smoothness Estimation
Sobolev quantities (norms, inner products, and distances) of probability...
Generalized Exponential Concentration Inequality for Rényi Divergence Estimation
Estimating divergences in a consistent way is of great importance in man...
Analysis of k-Nearest Neighbor Distances with Application to Entropy Estimation
Estimating entropy and mutual information consistently is important for ...
Multi-fidelity Gaussian Process Bandit Optimisation
In many scientific and engineering applications, we are tasked with the ...
Stochastic Variance Reduction for Nonconvex Optimization
We study nonconvex finite-sum problems and analyze stochastic variance r...
Fast Incremental Method for Nonconvex Optimization
We analyze a fast incremental aggregated gradient method for optimizing ...
Stochastic Neural Networks with Monotonic Activation Functions
We propose a Laplace approximation that creates a stochastic unit from a...
Boolean Matrix Factorization and Noisy Completion via Message Passing
Boolean matrix factorization and Boolean matrix completion from noisy ob...
Linear-time Learning on Distributions with Approximate Kernel Embeddings
Many interesting machine learning problems are best posed by considering...
Adaptivity and Computation-Statistics Tradeoffs for Kernel and Distance based High Dimensional Two Sample Testing
Nonparametric two sample testing is a decision theoretic problem that in...
Bayesian Nonparametric Kernel-Learning
Kernel methods are ubiquitous tools in machine learning. They have prove...
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants
We study optimization algorithms based on variance reduction for stochas...
An Analysis of Active Learning With Uniform Feature Noise
In active learning, the user sequentially chooses values for feature X a...
High Dimensional Bayesian Optimisation and Bandits via Additive Models
Bayesian Optimisation (BO) is a technique used in optimising a D-dimensi...
On the High-dimensional Power of Linear-time Kernel Two-Sample Testing under Mean-difference Alternatives
Nonparametric two sample testing deals with the question of consistently...
Influence Functions for Machine Learning: Nonparametric Estimators for Entropies, Divergences and Mutual Informations
We propose and analyze estimators for statistical functionals of one or ...
Learning Theory for Distribution Regression
We focus on the distribution regression problem: regressing to vector-va...
On Estimating L_2^2 Divergence
We give a comprehensive theoretical characterization of a nonparametric ...
