
Decision Making Problems with Funnel Structure: A Multi-Task Learning Approach with Application to Email Marketing Campaigns
This paper studies the decision making problem with Funnel Structure. Fu...
Federated Learning via Synthetic Data
Federated learning allows for the training of a model using data on mult...
TorsionNet: A Reinforcement Learning Approach to Sequential Conformer Search
Molecular geometry prediction of flexible molecules, or conformer search...
Low-Rank Generalized Linear Bandit Problems
In a low-rank linear bandit problem, the reward of an action (represente...
On the Equivalence between Online and Private Learnability beyond Binary Classification
Alon et al. [2019] and Bun et al. [2020] recently showed that online lea...
On Learnability under General Stochastic Processes
Statistical learning theory under independent and identically distribute...
Near-optimal Reinforcement Learning in Factored MDPs: Oracle-Efficient Algorithms for the Non-episodic Setting
We study reinforcement learning in factored Markov decision processes (F...
Near-optimal Oracle-efficient Algorithms for Stationary and Non-Stationary Stochastic Linear Bandits
We investigate the design of two algorithms that enjoy not only computat...
Online Boosting for Multilabel Ranking with Top-k Feedback
We present online boosting algorithms for multilabel ranking with top-k ...
Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles
Reinforcement learning (RL) methods have been shown to be capable of lea...
Thompson Sampling in Non-Episodic Restless Bandits
Restless bandit problems assume time-varying reward distributions of the...
What You See May Not Be What You Get: UCB Bandit Algorithms Robust to ε-Contamination
Motivated by applications of bandit algorithms in education, we consider...
Not All are Made Equal: Consistency of Weighted Averaging Estimators Under Active Learning
Active learning seeks to build the best possible model with a budget of ...
Regret Analysis of Causal Bandit Problems
We study how to learn optimal interventions sequentially given causal in...
Regret Bounds for Thompson Sampling in Restless Bandit Problems
Restless bandit problems are instances of non-stationary multi-armed ban...
Generalization Bounds in the Predict-then-Optimize Framework
The predict-then-optimize framework is fundamental in many practical set...
Randomized Algorithms for Data-Driven Stabilization of Stochastic Linear Systems
Data-driven control strategies for dynamical systems with unknown parame...
Contextual Markov Decision Processes using Generalized Linear Models
We consider the recently proposed reinforcement learning (RL) framework ...
On Applications of Bootstrap in Continuous Space Reinforcement Learning
In decision making problems for continuous state and action spaces, line...
On the Optimality of Perturbations in Stochastic and Adversarial Multi-armed Bandit Problems
We investigate the optimality of perturbation based algorithms in the st...
Input Perturbations for Adaptive Regulation and Learning
Design of adaptive algorithms for simultaneous regulation and estimation...
Online Multiclass Boosting with Bandit Feedback
We present online boosting algorithms for multiclass classification with...
Fighting Contextual Bandits with Stochastic Smoothing
We introduce a new stochastic smoothing perspective to study adversarial...
Random ReLU Features: Universality, Approximation, and Composition
We propose random ReLU features models in this work. Its motivation is r...
But How Does It Work in Theory? Linear SVM with Random Features
We prove that, under low noise assumptions, the support vector machine w...
Finite Time Adaptive Stabilization of LQ Systems
Stabilization of linear systems with unknown dynamics is a canonical pro...
On Optimality of Adaptive Linear-Quadratic Regulators
Adaptive regulation of linear systems represents a canonical problem in ...
Finite Time Analysis of Optimal Adaptive Policies for Linear-Quadratic Systems
We consider the classical problem of control of linear systems with quad...
Markov Decision Processes with Continuous Side Information
We consider a reinforcement learning (RL) setting in which the agent int...
Action Centered Contextual Bandits
Contextual bandits have become popular as they offer a middle ground bet...
Online Boosting Algorithms for Multilabel Ranking
We consider the multilabel ranking approach to multilabel learning. Bo...
An Actor-Critic Contextual Bandit Algorithm for Personalized Mobile Health Interventions
Increasing technological sophistication and widespread use of smartphone...
Online Multiclass Boosting
Recent work has extended the theoretical analysis of boosting algorithms...
Beyond the Hazard Rate: More Perturbation Algorithms for Adversarial Multi-armed Bandits
Recent work on follow the perturbed leader (FTPL) algorithms for the adv...
Sampled Fictitious Play is Hannan Consistent
Fictitious play is a simple and widely studied adaptive heuristic for pl...
Mixture Proportion Estimation via Kernel Embedding of Distributions
Mixture proportion estimation (MPE) is the problem of estimating the wei...
Lasso Guarantees for Time Series Estimation Under Subgaussian Tails and β-Mixing
Many theoretical results on estimation of high dimensional time series r...
Fighting Bandits with a New Kind of Smoothness
We define a novel family of algorithms for the adversarial multi-armed b...
Handling Class Imbalance in Link Prediction using Learning to Rank Techniques
We consider the link prediction problem in a partially observed network,...
Perceptron like Algorithms for Online Learning to Rank
Perceptron is a classic online algorithm for learning a classification f...
Consistent Algorithms for Multiclass Classification with a Reject Option
We consider the problem of n-class classification (n ≥ 2), where the clas...
On Iterative Hard Thresholding Methods for High-dimensional M-Estimation
The use of M-estimators in generalized linear regression models in high ...
Perceptron-like Algorithms and Generalization Bounds for Learning to Rank
Learning to rank is a supervised learning problem where the output space...
On Lipschitz Continuity and Smoothness of Loss Functions in Learning to Rank
In binary classification and regression problems, it is well understood ...
Feature Clustering for Accelerating Parallel Coordinate Descent
Large-scale L1-regularized loss minimization problems arise in high-dime...
The Interplay Between Stability and Regret in Online Learning
This paper considers the stability of online learning algorithms and its...
Scaling Up Coordinate Descent Algorithms for Large ℓ_1 Regularization Problems
We present a generic framework for parallel coordinate descent (CD) algo...
Online Bandit Learning against an Adaptive Adversary: from Regret to Policy Regret
Online learning algorithms are designed to learn even when their input i...
Orthogonal Matching Pursuit with Replacement
In this paper, we consider the problem of compressed sensing where the g...
Online Learning: Stochastic and Constrained Adversaries
Learning theory has largely focused on two main learning scenarios. The ...
Ambuj Tewari
Associate Professor of Statistics, Associate Professor of Electrical Engineering and Computer Science