
A comparative study of counterfactual estimators
We provide a comparative study of several widely used off-policy estimat...
Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe
We consider the problem of bandit optimization, inspired by stochastic o...
Approachability of convex sets in generalized quitting games
We consider Blackwell approachability, a very powerful and geometric too...
Gains and Losses are Fundamentally Different in Regret Minimization: The Sparse Case
We demonstrate that, in the classical non-stochastic regret minimization...
Online learning in repeated auctions
Motivated by online advertising auctions, we consider repeated Vickrey a...
Approachability in unknown games: Online learning meets multi-objective optimization
In the standard setting of approachability there are two players and a t...
Gaussian Process Optimization with Mutual Information
In this paper, we analyze a generic algorithm scheme for sequential glob...
A Primal Condition for Approachability with Partial Monitoring
In approachability with full monitoring there are two types of condition...
Bounded regret in stochastic multi-armed bandits
We study the stochastic multi-armed bandit problem when one knows the va...
The multi-armed bandit problem with covariates
We consider a multi-armed bandit problem in a setting where each arm pro...
Explicit shading strategies for repeated truthful auctions
With the increasing use of auctions in online advertising, there has bee...
Finding the Bandit in a Graph: Sequential Search-and-Stop
We consider the problem where an agent wants to find a hidden object tha...
Dynamic Pricing with Finitely Many Unknown Valuations
Motivated by posted price auctions where buyers are grouped in an unknow...
Bandits with Side Observations: Bounded vs. Logarithmic Regret
We consider the classical stochastic multi-armed bandit but where, from ...
SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits
We consider the stochastic multiplayer multi-armed bandit problem, where...
Bridging the gap between regret minimization and best arm identification, with application to A/B tests
State of the art online learning procedures focus either on selecting th...
Regularized Contextual Bandits
We consider the stochastic contextual bandit problem with additional reg...
Thresholding the virtual value: a simple method to increase welfare and lower reserve prices in online auction systems
Second price auctions with reserve price are widely used by the main Int...
Exploiting Structure of Uncertainty for Efficient Combinatorial Semi-Bandits
We improve the efficiency of algorithms for stochastic combinatorial sem...
A Problem-Adaptive Algorithm for Resource Allocation
We consider a sequential stochastic resource allocation problem under th...
Learning to bid in revenue-maximizing auctions
We consider the problem of the optimization of bidding strategies in pri...
A differential game on Wasserstein space. Application to weak approachability with partial monitoring
Studying continuous time counterpart of some discrete time dynamics is n...
Repeated A/B Testing
We study a setting in which a learner faces a sequence of A/B tests and ...
Active Linear Regression
We consider the problem of active linear regression where a decision mak...
Markov Decision Process for MOOC users behavioral inference
Studies on massive open online courses (MOOCs) users discuss the existen...
Private Learning and Regularized Optimal Transport
Private data are valuable either by remaining private (for instance if t...
Robust Stackelberg buyers in repeated auctions
We consider the practical and classical setting where the seller is usin...
Adversarial learning for revenue-maximizing auctions
We introduce a new numerical framework to learn optimal bidding strategi...
Selfish Robustness and Equilibria in Multi-Player Bandits
Motivated by cognitive radios, stochastic multiplayer multi-armed bandi...
Categorized Bandits
We introduce a new stochastic multi-armed bandit setting where arms are ...
