
Bandit Quickest Changepoint Detection
Detecting abrupt changes in temporal behavior patterns is of interest in...
Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling
We consider the problem of scheduling in constrained queueing networks w...
Improper Learning with Gradient-based Policy Optimization
We consider an improper reinforcement learning setting where the learner...
Stochastic Linear Bandits with Protected Subspace
We study a variant of the stochastic linear bandit problem wherein we op...
No-regret Algorithms for Multi-task Bayesian Optimization
We consider multi-objective optimization (MOO) of an unknown vector-valu...
Sequential Multi-hypothesis Testing in Multi-armed Bandit Problems: An Approach for Asymptotic Optimality
We consider a multi-hypothesis testing problem involving a K-armed bandi...
Explicit Best Arm Identification in Linear Bandits Using No-Regret Learners
We study the problem of best arm identification in linearly parameterise...
How Reliable are Test Numbers for Revealing the COVID-19 Ground Truth and Applying Interventions?
The number of confirmed cases of COVID-19 is often used as a proxy for t...
Regret Minimization in Stochastic Contextual Dueling Bandits
We consider the problem of stochastic K-armed dueling bandit in the cont...
Throughput Optimal Decentralized Scheduling with Single-bit State Feedback for a Class of Queueing Systems
Motivated by medium access control for resource-challenged wireless Inte...
Best-item Learning in Random Utility Models with Subset Choices
We consider the problem of PAC learning the most valuable item from a po...
Stability and Scalability of Blockchain Systems
The blockchain paradigm provides a mechanism for content dissemination a...
Sequential Mode Estimation with Oracle Queries
We consider the problem of adaptively PAC-learning a probability distrib...
Towards Optimal and Efficient Best Arm Identification in Linear Bandits
We give a new algorithm for best arm identification in linearly paramete...
On Online Learning in Kernelized Markov Decision Processes
We develop algorithms with low regret for learning episodic Markov decis...
On Batch Bayesian Optimization
We present two algorithms for Bayesian optimization in the batch feedbac...
On Adaptivity in Information-constrained Online Learning
We study how to adapt to smoothly-varying (`easy') environments in well...
Bayesian Optimization under Heavy-tailed Payoffs
We consider black box optimization of an unknown function in the nonpara...
From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model
We consider PAC learning for identifying a good item from subset-wise sa...
Regret Minimisation in Multinomial Logit Bandits
We consider two regret minimisation problems over subsets of a finite gr...
Active Ranking with Subset-wise Preferences
We consider the problem of probably approximately correct (PAC) ranking ...
PAC-Battling Bandits with Plackett-Luce: Tradeoff between Sample Complexity and Subset Size
We introduce the probably approximately correct (PAC) version of the pro...
Online Learning in Kernelized Markov Decision Processes
We consider online learning for minimizing regret in unknown, episodic M...
Optimal Odd Arm Identification with Fixed Confidence
The problem of detecting an odd arm from a set of K arms of a multi-arme...
Collaborative Learning of Stochastic Bandits over a Social Network
We consider a collaborative online learning paradigm, wherein a group of...
Thompson Sampling for Learning Parameterized Markov Decision Processes
We consider reinforcement learning in parameterized Markov Decision Proc...
Aditya Gopalan
