A Unified Algorithm for Stochastic Path Problems

10/17/2022
by Christoph Dann, et al.

We study reinforcement learning in stochastic path (SP) problems. The goal in these problems is to maximize the expected sum of rewards collected until the agent reaches a terminal state. We provide the first regret guarantees for this general setting by analyzing a simple optimistic algorithm. Our regret bound matches the best known results for the well-studied special case of stochastic shortest path (SSP), in which all rewards are non-positive. For SSP, we present an adaptation procedure for the case where the scale of rewards B_⋆ is unknown. We show that there is no price for adaptation: our regret bound matches the one obtained with a known B_⋆. We also provide a scale adaptation procedure for the special case of stochastic longest path (SLP), in which all rewards are non-negative. However, unlike in SSP, we show through a lower bound that there is an unavoidable price for adaptation.
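To make the objective concrete, the following is a minimal sketch of the quantity being maximized, the expected sum of rewards until a terminal state is reached, computed by plain value iteration on a toy instance. The instance, the `transitions` layout, and the `value_iteration` helper are all hypothetical and chosen for illustration. The sketch assumes the transition model is known, so it illustrates only the planning objective and not the optimistic learning algorithm analyzed in the paper.

```python
# Hypothetical toy SP instance (not taken from the paper).
# States 0 and 1 are non-terminal; TERMINAL ends the episode.
# All rewards are non-positive, so this falls in the SSP special case.
TERMINAL = -1

# transitions[state][action] = (reward, [(next_state, probability), ...])
transitions = {
    0: {0: (-1.0, [(1, 1.0)]),
        1: (-4.0, [(TERMINAL, 1.0)])},
    1: {0: (-1.0, [(TERMINAL, 0.5), (1, 0.5)]),
        1: (-3.0, [(TERMINAL, 1.0)])},
}


def value_iteration(transitions, tol=1e-10, max_iters=10_000):
    """Approximate the maximal expected total reward until termination."""
    values = {s: 0.0 for s in transitions}
    for _ in range(max_iters):
        new_values = {}
        for s, actions in transitions.items():
            # The terminal state is absent from `values`, so it contributes 0.
            new_values[s] = max(
                r + sum(p * values.get(nxt, 0.0) for nxt, p in outcomes)
                for r, outcomes in actions.values()
            )
        gap = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if gap < tol:
            break
    return values


optimal_values = value_iteration(transitions)
```

In this toy instance, state 1 prefers the cheap action that terminates with probability 0.5 (expected total reward -2 rather than -3), and state 0 prefers moving to state 1 (expected total reward -3 rather than -4). In the learning setting studied here the transition model is unknown, and regret measures how much total reward is lost relative to such an optimal policy.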

research
10/25/2021

Learning Stochastic Shortest Path with Linear Function Approximation

We study the stochastic shortest path (SSP) problem in reinforcement lea...
research
02/23/2020

Near-optimal Regret Bounds for Stochastic Shortest Path

Stochastic shortest path (SSP) is a well-known problem in planning and c...
research
08/16/2017

Corrupt Bandits for Preserving Local Privacy

We study a variant of the stochastic multi-armed bandit (MAB) problem in...
research
05/04/2021

Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation

We propose two algorithms for episodic stochastic shortest path problems...
research
06/20/2020

Adversarial Stochastic Shortest Path

Stochastic shortest path (SSP) is a well-known problem in planning and c...
research
02/14/2012

Suboptimality Bounds for Stochastic Shortest Path Problems

We consider how to use the Bellman residual of the dynamic programming o...
research
02/01/2023

Uniswap Liquidity Provision: An Online Learning Approach

Decentralized Exchanges (DEXs) are new types of marketplaces leveraging ...