Decision-making problems of a sequential nature, where decisions made in t...
Sparsity of rewards while applying a deep reinforcement learning method ...
We study the problem of preserving privacy while still providing high ut...
We investigate the Multi-Armed Bandit problem with Temporally-Partitione...
We study a posterior sampling approach to efficient exploration in const...
Fairness-aware learning aims at satisfying various fairness constraints ...
We consider a special case of bandit problems, named batched bandits, in...
We consider a special case of bandit problems, namely batched bandits. M...
We consider a setting in which the objective is to learn to navigate in ...
We consider undiscounted reinforcement learning in Markov decision proce...
We consider reinforcement learning in changing Markov Decision Processes...
Counterfactual learning is a natural scenario to improve web-based machi...
Machine learning algorithms for prediction are increasingly being used i...
We study a variant of the stochastic multi-armed bandit (MAB) problem in...