In this paper, we study the problem of (finite horizon tabular) Markov d...
We consider the cross-silo federated linear contextual bandit (LCB) problem ...
We present a non-asymptotic lower bound on the eigenspectrum of the desi...
We consider model selection for classic Reinforcement Learning (RL) envi...
We consider the standard K-armed bandit problem under a distributed trus...
Differential privacy (DP) has been recently introduced to linear context...
We revisit the method of mixture technique, also known as the Laplace me...
We study regret minimization in finite horizon tabular Markov decision p...
In this paper, we study the problem of regret minimization in reinforcem...
We address the problem of model selection for the finite horizon episodi...
We consider the regret minimisation problem in reinforcement learning (R...
We consider multi-objective optimization (MOO) of an unknown vector-valu...
We develop algorithms with low regret for learning episodic Markov decis...
We present two algorithms for Bayesian optimization in the batch feedbac...
We consider black box optimization of an unknown function in the nonpara...
We consider online learning for minimizing regret in unknown, episodic M...