
Tiering as a Stochastic Submodular Optimization Problem
Tiering is an essential technique for building large-scale information r...
Recognizing Variables from their Data via Deep Embeddings of Distributions
A key obstacle in automated analytics and meta-learning is the inability...
Deep Factors for Forecasting
Producing probabilistic forecasts for large collections of similar and/o...
SysML: The New Frontier of Machine Learning Systems
Machine learning (ML) techniques are enjoying rapidly increasing adoptio...
Deep Factors with Gaussian Processes for Forecasting
A large collection of time series poses significant challenges for class...
Deep Graphs
We propose an algorithm for deep learning on networks and graphs. It rel...
Detecting and Correcting for Label Shift with Black Box Predictors
Faced with distribution shift between training and test set, we wish to ...
Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning
Knowledge bases (KB), both automatically and manually constructed, are o...
Efficient Multitask Feature and Relationship Learning
In this paper we propose a multi-convex framework for multi-task learnin...
Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy
We propose a method to optimize the representation and distinguishabilit...
AIDE: Fast and Communication Efficient Distributed Optimization
In this paper, we present two new communication-efficient methods for di...
Stochastic Frank-Wolfe Methods for Nonconvex Optimization
We study Frank-Wolfe methods for nonconvex stochastic and finite-sum opt...
Neural Machine Translation with Recurrent Attention Modeling
Knowing which words have been attended to in previous time steps while g...
Fast Stochastic Methods for Nonsmooth Nonconvex Optimization
We analyze stochastic algorithms for optimizing nonconvex, nonsmooth fin...
Stochastic Variance Reduction for Nonconvex Optimization
We study nonconvex finite-sum problems and analyze stochastic variance r...
Fast Incremental Method for Nonconvex Optimization
We analyze a fast incremental aggregated gradient method for optimizing ...
Data Driven Resource Allocation for Distributed Learning
In distributed machine learning, data is dispatched to multiple machines...
Stacked Attention Networks for Image Question Answering
This paper presents stacked attention networks (SANs) that learn to answ...
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants
We study optimization algorithms based on variance reduction for stochas...
Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo
We consider the problem of Bayesian learning on sensitive datasets and p...
Deep Fried Convnets
The fully connected layers of a deep convolutional neural network typica...
Trend Filtering on Graphs
We introduce a family of adaptive estimators on graphs, based on penaliz...
The Falling Factorial Basis and Its Statistical Applications
We study a novel spline-like basis, which we name the "falling factorial...
Randomized Nonlinear Component Analysis
Classical methods such as Principal Component Analysis (PCA) and Canonic...
Exponential Families for Conditional Random Fields
In this paper we define conditional random fields in reproducing kernel Hil...
Exponential Regret Bounds for Gaussian Process Bandits with Deterministic Observations
This paper analyzes the problem of Gaussian process (GP) bandits with de...
Super-Samples from Kernel Herding
We extend the herding algorithm to continuous spaces by using the kernel...
Regret Bounds for Deterministic Gaussian Process Bandits
This paper analyses the problem of Gaussian process (GP) bandits with de...
Alex Smola
Director, Machine Learning at Amazon, Professor at Carnegie Mellon University