
Thompson Sampling for CVaR Bandits
Risk awareness is an important feature to formulate a variety of real wo...

Sub-sampling for Efficient Non-Parametric Bandit Exploration
In this paper we propose the first multi-armed bandit algorithm based on...

Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited
In this paper, we propose new problem-independent lower bounds on the sa...

Fast active learning for pure exploration in reinforcement learning
Realistic environments often provide agents with very limited feedback. ...

A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces
In this work, we propose KeRNS: an algorithm for episodic reinforcement ...

Adaptive Reward-Free Exploration
Reward-free exploration is a reinforcement learning setting recently stu...

Planning in Markov Decision Processes with Gap-Dependent Sample Complexity
We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algo...

Regret Bounds for Kernel-Based Reinforcement Learning
We consider the exploration-exploitation dilemma in finite-horizon reinf...

Solving Bernoulli Rank-One Bandits with Unimodal Thompson Sampling
Stochastic Rank-One Bandits (Katariya et al., 2017a,b) are a simple fram...

Fixed-Confidence Guarantees for Bayesian Best-Arm Identification
We investigate and provide new insights on the sampling rule called Top...

Non-Asymptotic Sequential Tests for Overlapping Hypotheses and Application to Near-Optimal Arm Identification in Bandit Models
In this paper, we study sequential testing problems with overlapping hyp...

On Multi-Armed Bandit Designs for Phase I Clinical Trials
We study the problem of finding the optimal dosage in a phase I clinical...

The Generalized Likelihood Ratio Test meets kl-UCB: an Improved Algorithm for Piece-Wise Non-Stationary Bandits
We propose a new algorithm for the piece-wise non-stationary bandit pro...

New Algorithms for Multiplayer Bandits when Arm Means Vary Among Players
We study multiplayer stochastic multi-armed bandit problems in which the...

Mixture Martingales Revisited with Applications to Sequential Tests and Confidence Intervals
This paper presents new deviation inequalities that are valid uniformly ...

Multi-Armed Bandit Learning in IoT Networks: Learning helps even in non-stationary settings
Setting up the future Internet of Things (IoT) networks will require to ...

Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling
Learning the minimum/maximum mean among a finite set of distributions is...

What Doubling Tricks Can and Can't Do for Multi-Armed Bandits
An online reinforcement learning algorithm is anytime if it does not nee...

Pure Exploration in Infinitely-Armed Bandit Models with Fixed-Confidence
We consider the problem of near-optimal arm identification in the fixed ...

Multi-Player Bandits Models Revisited
Multi-player Multi-Armed Bandits (MAB) have been extensively studied in ...

Corrupt Bandits for Preserving Local Privacy
We study a variant of the stochastic multi-armed bandit (MAB) problem in...

Monte-Carlo Tree Search by Best Arm Identification
Recent advances in bandit tools and techniques for sequential learning a...

Learning the distribution with largest mean: two bandit frameworks
Over the past few years, the multi-armed bandit model has become increas...

Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits
We study a generalization of the multi-armed bandit problem with multipl...

Maximin Action Identification: A New Bandit Framework for Games
We study an original problem of pure exploration in a strategic bandit m...

Optimal Best Arm Identification with Fixed Confidence
We give a complete characterization of the complexity of best-arm identi...

On Bayesian index policies for sequential resource allocation
This paper is about index policies for minimizing (frequentist) regret i...

A Spectral Algorithm with Additive Clustering for the Recovery of Overlapping Communities in Networks
This paper presents a novel spectral algorithm with additive clustering ...

On the Complexity of Best Arm Identification in Multi-Armed Bandit Models
The stochastic multi-armed bandit model is a simple abstraction that has...

On the Complexity of A/B Testing
A/B testing refers to the task of determining the best option among two ...

Thompson Sampling for 1-Dimensional Exponential Family Bandits
Thompson Sampling has been demonstrated in many complex bandit models, h...

Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
The question of the optimality of Thompson Sampling for solving the stoc...
Emilie Kaufmann
CNRS Junior Researcher at CRIStAL, Université de Lille