
Pure Exploration and Regret Minimization in Matching Bandits
Finding an optimal matching in a weighted graph is a standard combinator...

Online Matching in Sparse Random Graphs: Non-Asymptotic Performances of Greedy Algorithm
Motivated by sequential budgeted allocation problems, we investigate onl...

Unsupervised Neural Hidden Markov Models with a Continuous latent state space
We introduce a new procedure to neuralize unsupervised Hidden Markov Mod...

Offline Inverse Reinforcement Learning
The objective of offline RL is to learn optimal policies when a fixed ex...

Decentralized Learning in Online Queuing Systems
Motivated by packet routing in computer networks, online queuing systems...

A Generalised Inverse Reinforcement Learning Framework
The global objective of inverse Reinforcement Learning (IRL) is to esti...

Homomorphically Encrypted Linear Contextual Bandit
Contextual bandit is a general framework for online learning in sequenti...

Making the most of your day: online learning for optimal allocation of time
We study online learning for optimal allocation when the resource to be ...

Be Greedy in Multi-Armed Bandits
The Greedy algorithm is the simplest heuristic in sequential decision pr...

Lifelong Learning in Multi-Armed Bandits
Continuously learning and leveraging the knowledge accumulated from prio...

Learning in repeated auctions
Auction theory historically focused on the question of designing the bes...

Robustness of Community Detection to Random Geometric Perturbations
We consider the stochastic block model where connection between vertices...

Local Differentially Private Regret Minimization in Reinforcement Learning
Reinforcement learning algorithms are widely used in domains where it is...

Speed of Social Learning from Reviews in Non-Stationary Environments
Potential buyers of a product or service tend to first browse feedback f...

Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits
We investigate stochastic combinatorial multi-armed bandit with semiban...

Categorized Bandits
We introduce a new stochastic multi-armed bandit setting where arms are ...

Selfish Robustness and Equilibria in Multi-Player Bandits
Motivated by cognitive radios, stochastic multi-player multi-armed bandi...

Adversarial learning for revenue-maximizing auctions
We introduce a new numerical framework to learn optimal bidding strategi...

Markov Decision Process for MOOC users behavioral inference
Studies on massive open online courses (MOOCs) users discuss the existen...

Active Linear Regression
We consider the problem of active linear regression where a decision mak...

Robust Stackelberg buyers in repeated auctions
We consider the practical and classical setting where the seller is usin...

Repeated A/B Testing
We study a setting in which a learner faces a sequence of A/B tests and ...

Private Learning and Regularized Optimal Transport
Private data are valuable either by remaining private (for instance if t...

Learning to bid in revenue-maximizing auctions
We consider the problem of the optimization of bidding strategies in pri...

A Problem-Adaptive Algorithm for Resource Allocation
We consider a sequential stochastic resource allocation problem under th...

Exploiting Structure of Uncertainty for Efficient Combinatorial Semi-Bandits
We improve the efficiency of algorithms for stochastic combinatorial sem...

A differential game on Wasserstein space. Application to weak approachability with partial monitoring
Studying continuous time counterpart of some discrete time dynamics is n...

Regularized Contextual Bandits
We consider the stochastic contextual bandit problem with additional reg...

Bridging the gap between regret minimization and best arm identification, with application to A/B tests
State of the art online learning procedures focus either on selecting th...

SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits
We consider the stochastic multiplayer multi-armed bandit problem, where...

Thresholding the virtual value: a simple method to increase welfare and lower reserve prices in online auction systems
Second price auctions with reserve price are widely used by the main Int...

Bandits with Side Observations: Bounded vs. Logarithmic Regret
We consider the classical stochastic multi-armed bandit but where, from ...

Dynamic Pricing with Finitely Many Unknown Valuations
Motivated by posted price auctions where buyers are grouped in an unknow...

Finding the Bandit in a Graph: Sequential Search-and-Stop
We consider the problem where an agent wants to find a hidden object tha...

Explicit shading strategies for repeated truthful auctions
With the increasing use of auctions in online advertising, there has bee...

A comparative study of counterfactual estimators
We provide a comparative study of several widely used off-policy estimat...

Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe
We consider the problem of bandit optimization, inspired by stochastic o...

Approachability of convex sets in generalized quitting games
We consider Blackwell approachability, a very powerful and geometric too...

Gains and Losses are Fundamentally Different in Regret Minimization: The Sparse Case
We demonstrate that, in the classical non-stochastic regret minimization...

Online learning in repeated auctions
Motivated by online advertising auctions, we consider repeated Vickrey a...

Approachability in unknown games: Online learning meets multi-objective optimization
In the standard setting of approachability there are two players and a t...

Gaussian Process Optimization with Mutual Information
In this paper, we analyze a generic algorithm scheme for sequential glob...

A Primal Condition for Approachability with Partial Monitoring
In approachability with full monitoring there are two types of condition...

Bounded regret in stochastic multi-armed bandits
We study the stochastic multi-armed bandit problem when one knows the va...

The multi-armed bandit problem with covariates
We consider a multi-armed bandit problem in a setting where each arm pro...
Vianney Perchet