research ∙ 07/19/2022
Actor-Critic based Improper Reinforcement Learning
We consider an improper reinforcement learning setting where a learner i...
research ∙ 02/16/2021
Improper Learning with Gradient-based Policy Optimization
We consider an improper reinforcement learning setting where the learner...
research ∙ 06/13/2020
Explicit Best Arm Identification in Linear Bandits Using No-Regret Learners
We study the problem of best arm identification in linearly parameterise...
research ∙ 11/05/2019