
Gradient Descent Provably Optimizes Overparameterized Neural Networks
One of the mysteries in the success of neural networks is randomly initial...

A Deep Reinforcement Learning Approach for Global Routing
Global routing has been a historically challenging problem in electronic...

ProBO: a Framework for Using Probabilistic Programming in Bayesian Optimization
Optimizing an expensive-to-query function is a common task in science an...

Competence-based Curriculum Learning for Neural Machine Translation
Current state-of-the-art NMT systems use large neural networks that are ...

Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly
Bayesian Optimisation (BO) refers to a suite of techniques for global o...

LucidDream: Controlled Temporally-Consistent DeepDream on Videos
In this work, we aim to propose a set of techniques to improve the contr...

Unsupervised Program Synthesis for Images using Tree-Structured LSTM
Program synthesis has recently emerged as a promising approach to the im...

Developing Creative AI to Generate Sculptural Objects
We explore the intersection of human and machine creativity by generatin...

Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels
While graph kernels (GKs) are easy to train and enjoy provable theoretic...

Cautious Deep Learning
Most classifiers operate by selecting the maximum of an estimate of the ...

Estimating Cosmological Parameters from the Dark Matter Distribution
A grand challenge of 21st-century cosmology is to accurately estimat...

A Generic Approach for Escaping Saddle Points
A central challenge to using first-order methods for optimizing nonconve...

On the Reconstruction Risk of Convolutional Sparse Dictionary Learning
Sparse dictionary learning (SDL) has become a popular method for adaptiv...

Recurrent Estimation of Distributions
This paper presents the recurrent estimation of distributions (RED) for ...

Gradient Descent Can Take Exponential Time to Escape Saddle Points
Although gradient descent (GD) almost always escapes saddle points asymp...

Asynchronous Parallel Bayesian Optimisation via Thompson Sampling
We design and analyse variations of the classical Thompson sampling (TS)...

MMD GAN: Towards Deeper Understanding of Moment Matching Network
Generative moment matching network (GMMN) is a deep generative model tha...

Data-driven Random Fourier Features using Stein Effect
Large-scale kernel approximation is an important problem in machine lear...

Multi-fidelity Bayesian Optimisation with Continuous Approximations
Bandit methods for black-box optimisation, such as Bayesian optimisation...

Deep Sets
In this paper, we study the problem of designing objective functions for...

The Statistical Recurrent Unit
Sophisticated gated recurrent neural network architectures like LSTMs an...

Equivariance Through Parameter-Sharing
We propose to study equivariance in deep neural networks through paramet...

Nonparanormal Information Estimation
We study the problem of using i.i.d. samples from an unknown multivariat...

Query Efficient Posterior Estimation in Scientific Experiments via Bayesian Active Learning
A common problem in disciplines of applied Statistics research such as A...

Deep Learning with Sets and Point Clouds
We introduce a simple permutation equivariant layer for deep learning wi...

Annealing Gaussian into ReLU: a New Sampling Strategy for Leaky-ReLU RBM
Restricted Boltzmann Machine (RBM) is a bipartite graphical model that i...

One Network to Solve Them All: Solving Linear Inverse Problems using Deep Projection Models
While deep learning methods have achieved state-of-the-art performance i...

Enabling Dark Energy Science with Deep Generative Models of Galaxy Images
Understanding the nature of dark energy, the mysterious force driving th...

AIDE: Fast and Communication Efficient Distributed Optimization
In this paper, we present two new communication-efficient methods for di...

Stochastic Frank-Wolfe Methods for Nonconvex Optimization
We study Frank-Wolfe methods for nonconvex stochastic and finite-sum opt...

Finite-Sample Analysis of Fixed-k Nearest Neighbor Density Functional Estimators
We provide finite-sample analysis of a general framework for using k-nea...

Fast Stochastic Methods for Nonsmooth Nonconvex Optimization
We analyze stochastic algorithms for optimizing nonconvex, nonsmooth fin...

Efficient Nonparametric Smoothness Estimation
Sobolev quantities (norms, inner products, and distances) of probability...

Generalized Exponential Concentration Inequality for Rényi Divergence Estimation
Estimating divergences in a consistent way is of great importance in man...

Analysis of k-Nearest Neighbor Distances with Application to Entropy Estimation
Estimating entropy and mutual information consistently is important for ...

Multi-fidelity Gaussian Process Bandit Optimisation
In many scientific and engineering applications, we are tasked with the ...

Stochastic Variance Reduction for Nonconvex Optimization
We study nonconvex finite-sum problems and analyze stochastic variance r...

Fast Incremental Method for Nonconvex Optimization
We analyze a fast incremental aggregated gradient method for optimizing ...

Stochastic Neural Networks with Monotonic Activation Functions
We propose a Laplace approximation that creates a stochastic unit from a...

Boolean Matrix Factorization and Noisy Completion via Message Passing
Boolean matrix factorization and Boolean matrix completion from noisy ob...

Linear-time Learning on Distributions with Approximate Kernel Embeddings
Many interesting machine learning problems are best posed by considering...

Adaptivity and Computation-Statistics Tradeoffs for Kernel and Distance based High Dimensional Two Sample Testing
Nonparametric two sample testing is a decision theoretic problem that in...

Bayesian Nonparametric Kernel-Learning
Kernel methods are ubiquitous tools in machine learning. They have prove...

On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants
We study optimization algorithms based on variance reduction for stochas...

An Analysis of Active Learning With Uniform Feature Noise
In active learning, the user sequentially chooses values for feature X a...

High Dimensional Bayesian Optimisation and Bandits via Additive Models
Bayesian Optimisation (BO) is a technique used in optimising a D-dimensi...

On the High-dimensional Power of Linear-time Kernel Two-Sample Testing under Mean-difference Alternatives
Nonparametric two sample testing deals with the question of consistently...

Influence Functions for Machine Learning: Nonparametric Estimators for Entropies, Divergences and Mutual Informations
We propose and analyze estimators for statistical functionals of one or ...

Learning Theory for Distribution Regression
We focus on the distribution regression problem: regressing to vector-va...

On Estimating L_2^2 Divergence
We give a comprehensive theoretical characterization of a nonparametric ...