
Provably Efficient Reinforcement Learning with Aggregated States
We establish that an optimistic variant of Q-learning applied to a finit...

Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap
This paper presents a new model-free algorithm for episodic finite-horiz...

Minimax Regret Bounds for Reinforcement Learning
We consider the problem of provably optimal exploration in reinforcement...

Learning Zero-sum Stochastic Games with Posterior Sampling
In this paper, we propose Posterior Sampling Reinforcement Learning for ...

Correcting Momentum in Temporal Difference Learning
A common optimization tool used in deep reinforcement learning is moment...

Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning
We introduce SCAL, an algorithm designed to perform efficient exploratio...

Nearly Horizon-Free Offline Reinforcement Learning
We revisit offline reinforcement learning on episodic time-homogeneous t...
UCB Momentum Q-learning: Correcting the bias without forgetting
We propose UCBMQ, Upper Confidence Bound Momentum Q-learning, a new algorithm for reinforcement learning in tabular, possibly stage-dependent, episodic Markov decision processes. UCBMQ is based on Q-learning, to which we add a momentum term, and relies on the principle of optimism in the face of uncertainty to handle exploration. The new technical ingredient of UCBMQ is the use of momentum to correct the bias that Q-learning suffers from while, at the same time, limiting the impact this correction has on the second-order term of the regret. For UCBMQ, we are able to guarantee a regret of at most O(√(H^3SAT) + H^4SA), where H is the length of an episode, S the number of states, A the number of actions, and T the number of episodes, ignoring terms in polylog(SAHT). Notably, UCBMQ is the first algorithm that simultaneously matches the lower bound of Ω(√(H^3SAT)) for large enough T and has a second-order term (with respect to the number of episodes T) that scales only linearly with the number of states S.
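To make "for large enough T" concrete, the two terms of the stated bound can be compared directly. The following is a rough sketch only: it drops all constants and polylog(SAHT) factors, and the function names and the example values of H, S, and A are hypothetical, not taken from the paper. Under those assumptions, √(H^3SAT) ≥ H^4SA holds exactly when T ≥ H^5SA.

```python
import math


def leading_term(H, S, A, T):
    """First-order term sqrt(H^3 * S * A * T), constants and log factors dropped."""
    return math.sqrt(H**3 * S * A * T)


def second_order_term(H, S, A):
    """Second-order term H^4 * S * A, which does not depend on T."""
    return H**4 * S * A


def crossover_episodes(H, S, A):
    """Smallest T at which the leading term matches the second-order term.

    sqrt(H^3 S A T) >= H^4 S A  <=>  H^3 S A T >= H^8 S^2 A^2  <=>  T >= H^5 S A
    """
    return H**5 * S * A


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only for illustration.
    H, S, A = 10, 20, 5
    T_star = crossover_episodes(H, S, A)
    for T in (T_star // 100, T_star, T_star * 100):
        lead = leading_term(H, S, A, T)
        second = second_order_term(H, S, A)
        dominant = "leading" if lead >= second else "second-order"
        print(f"T={T}: leading={lead:.3g}, second-order={second:.3g}, dominant={dominant}")
```

Because the second-order term is linear in S, the crossover point H^5SA is also linear in S. A second-order term with a higher power of S would push this threshold out accordingly.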