Despite recent theoretical progress on the non-convex optimization of
tw...
Many machine learning applications require learning a function with a sm...
We present a model-based offline reinforcement learning policy performan...
Real-world machine learning applications often involve deploying neural
...
Past research on interactive decision making problems (bandits, reinforc...
In the stochastic linear contextual bandit setting there exist several
m...
This paper studies model-based bandit and reinforcement learning (RL) wi...
We consider the adversarial Markov Decision Process (MDP) problem, where...
We study the multinomial logit bandit with limited adaptivity, where the
alg...
We compare model-free reinforcement learning with the model-based
ap...
In this paper, we consider the problem of online learning of Markov deci...
Goal-oriented reinforcement learning has recently been a practical frame...
A fundamental question in reinforcement learning is whether model-free
a...