Beyond dynamic programming

06/26/2023
by Abhinav Muraleedharan, et al.

In this paper, we present Score-life programming, a novel theoretical approach for solving reinforcement learning problems. In contrast to classical dynamic programming-based methods, our method can search over non-stationary policy functions and can directly compute optimal infinite horizon action sequences from a given state. The central idea of our method is the construction of a mapping between infinite horizon action sequences and real numbers in a bounded interval. This construction enables us to formulate an optimization problem for directly computing optimal infinite horizon action sequences, without requiring a policy function. We demonstrate the effectiveness of our approach by applying it to nonlinear optimal control problems. Overall, our contributions provide a novel theoretical framework for formulating and solving reinforcement learning problems.
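
To make the idea of mapping action sequences to real numbers concrete, here is a minimal sketch of one possible encoding. It is an assumption for illustration only, since the abstract does not spell out the paper's actual construction: with a finite action set of size `num_actions`, an action sequence is read as the digits of a base-`num_actions` expansion, which places it in the interval [0, 1). The function names `sequence_to_real` and `real_to_sequence` are hypothetical, and only finite prefixes of a sequence are handled.

```python
from typing import List, Sequence


def sequence_to_real(actions: Sequence[int], num_actions: int) -> float:
    """Map a finite prefix of an action sequence to a number in [0, 1).

    Assumed encoding (not taken from the paper): action a_k contributes
    a_k / num_actions**(k + 1), i.e. the sequence is read as the digits
    of a base-`num_actions` expansion.
    """
    value = 0.0
    scale = 1.0 / num_actions
    for a in actions:
        if not 0 <= a < num_actions:
            raise ValueError("action index out of range")
        value += a * scale
        scale /= num_actions
    return value


def real_to_sequence(x: float, num_actions: int, horizon: int) -> List[int]:
    """Recover the first `horizon` actions encoded by x in [0, 1)."""
    if not 0.0 <= x < 1.0:
        raise ValueError("x must lie in [0, 1)")
    actions = []
    for _ in range(horizon):
        x *= num_actions
        digit = int(x)      # leading digit is the next action
        actions.append(digit)
        x -= digit          # keep the fractional part for later actions
    return actions


# Two actions (0 and 1): the prefix 1, 0, 1 maps to 1/2 + 0/4 + 1/8.
x = sequence_to_real([1, 0, 1], num_actions=2)
prefix = real_to_sequence(x, num_actions=2, horizon=5)
```

Under an encoding of this kind, a search over action sequences becomes a search over a single real variable in a bounded interval, which is the sort of reformulation the abstract describes. Floating-point precision limits how many actions can be recovered exactly, so this sketch is only meaningful for short prefixes.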

research
08/03/2022

Contractivity of Bellman Operator in Risk Averse Dynamic Programming with Infinite Horizon

The paper deals with a risk averse dynamic programming problem with infi...
research
06/03/2013

On the Performance Bounds of some Policy Search Dynamic Programming Algorithms

We consider the infinite-horizon discounted optimal control problem form...
research
06/24/2023

On Convex Data-Driven Inverse Optimal Control for Nonlinear, Non-stationary and Stochastic Systems

This paper is concerned with a finite-horizon inverse control problem, w...
research
03/28/2023

Worst-Case Control and Learning Using Partial Observations Over an Infinite Time-Horizon

Safety-critical cyber-physical systems require control strategies whose ...
research
11/26/2019

Deep adaptive dynamic programming for nonaffine nonlinear optimal control problem with state constraints

This paper presents a constrained deep adaptive dynamic programming (CDA...
research
03/24/2020

Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning

Off-policy estimation for long-horizon problems is important in many rea...
research
01/29/2021

Can Machine Learning Help in Solving Cargo Capacity Management Booking Control Problems?

Revenue management is important for carriers (e.g., airlines and railroa...
