Despite recent theoretical progress on the non-convex optimization of
Many machine learning applications require learning a function with a sm...
We present a model-based offline reinforcement learning policy performan...
Real-world machine learning applications often involve deploying neural
Past research on interactive decision making problems (bandits, reinforc...
In the stochastic linear contextual bandit setting, there exist several
This paper studies model-based bandit and reinforcement learning (RL) wi...
We consider the adversarial Markov Decision Process (MDP) problem, where...
We study the multinomial logit bandit with limited adaptivity, where the
We compare model-free reinforcement learning with the model-based
In this paper, we consider the problem of online learning of Markov deci...
Goal-oriented reinforcement learning has recently been a practical frame...
A fundamental question in reinforcement learning is whether model-free