Path Planning using Reinforcement Learning: A Policy Iteration Approach

03/13/2023
by Saumil Shivdikar, et al.

As real-time processing has grown in importance in recent years, so has the need for efficient implementations of reinforcement learning (RL) algorithms. Despite the numerous advantages of the Bellman equations used in RL algorithms, they come with a large search space of design parameters. This research sheds light on the design space exploration associated with reinforcement learning parameters, specifically those of Policy Iteration. Because fine-tuning the parameters of reinforcement learning algorithms is computationally expensive, we propose an auto-tuner-based ordinal regression approach to accelerate the exploration of these parameters and, in turn, accelerate convergence towards an optimal policy. Our approach provides a 1.82x peak speedup and a 1.48x average speedup over the previous state-of-the-art.
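For readers unfamiliar with Policy Iteration, the following is a minimal sketch of the textbook algorithm applied to a toy path-planning problem. It is not the authors' implementation and does not include their auto-tuner. The 4x4 grid, the reward of -1 per move, and the names `step`, `policy_iteration`, and `extract_path` are all assumptions made for illustration. The discount factor `gamma` and evaluation tolerance `theta` are shown only as examples of parameters one might tune; the abstract does not say which parameters the authors explore.

```python
import numpy as np

# Hypothetical 4x4 grid world (an assumption, not from the paper):
# every move costs -1 and the goal is the bottom-right cell.
N = 4
GOAL = (N - 1, N - 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Deterministic transition: move if in bounds, otherwise stay put."""
    if state == GOAL:
        return state, 0.0  # the goal is absorbing and costs nothing
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < N and 0 <= c < N):
        r, c = state
    return (r, c), -1.0


def policy_iteration(gamma=0.9, theta=1e-6):
    """Alternate policy evaluation and greedy improvement until the policy is stable."""
    V = np.zeros((N, N))
    policy = np.zeros((N, N), dtype=int)  # each entry indexes into ACTIONS
    while True:
        # Policy evaluation: apply the Bellman expectation backup in place
        # until the largest change in a sweep drops below theta.
        while True:
            delta = 0.0
            for r in range(N):
                for c in range(N):
                    nxt, reward = step((r, c), ACTIONS[policy[r, c]])
                    v_new = reward + gamma * V[nxt]
                    delta = max(delta, abs(v_new - V[r, c]))
                    V[r, c] = v_new
            if delta < theta:
                break
        # Policy improvement: switch to a strictly better action where one exists.
        stable = True
        for r in range(N):
            for c in range(N):
                q = []
                for a in ACTIONS:
                    nxt, reward = step((r, c), a)
                    q.append(reward + gamma * V[nxt])
                best = int(np.argmax(q))
                if q[best] > q[policy[r, c]] + 1e-9:
                    policy[r, c] = best
                    stable = False
        if stable:
            return V, policy


def extract_path(policy, start):
    """Follow the greedy policy from start, giving up after N*N moves."""
    path, state = [start], start
    for _ in range(N * N):
        if state == GOAL:
            break
        state, _ = step(state, ACTIONS[policy[state]])
        path.append(state)
    return path


if __name__ == "__main__":
    V, policy = policy_iteration()
    print(extract_path(policy, (0, 0)))
```

In this sketch, a smaller `theta` makes each evaluation phase more accurate but adds sweeps, while `gamma` controls how strongly distant rewards influence the plan. Trade-offs of this kind are what make parameter tuning costly.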


research
02/10/2018

Beyond the One Step Greedy Approach in Reinforcement Learning

The famous Policy Iteration algorithm alternates between policy improvem...
research
09/30/2021

Scalable Online Planning via Reinforcement Learning Fine-Tuning

Lookahead search has been a critical component of recent AI successes, s...
research
02/28/2020

Mixed Reinforcement Learning with Additive Stochastic Uncertainty

Reinforcement learning (RL) methods often rely on massive exploration da...
research
03/21/2020

Deep Reinforcement Learning with Smooth Policy

Deep neural networks have been widely adopted in modern reinforcement le...
research
02/23/2019

Distributionally Robust Reinforcement Learning

Generalization to unknown/uncertain environments of reinforcement learni...
research
01/31/2022

Reinforcement Learning with Heterogeneous Data: Estimation and Inference

Reinforcement Learning (RL) has the promise of providing data-driven sup...
research
07/19/2021

Reward-Weighted Regression Converges to a Global Optimum

Reward-Weighted Regression (RWR) belongs to a family of widely known ite...
