Optimizing pre-scheduled, intermittently-observed MDPs

05/16/2023
by Patrick Zhong, et al.

A challenging category of robotics problems arises when sensing incurs substantial costs. This paper examines settings in which a robot wishes to limit its observations of state, motivated, for instance, by considerations of energy management, stealth, or implicit coordination. We formulate the problem of planning under uncertainty when the robot's observations are intermittent but their timing is known via a pre-declared schedule. After establishing the appropriate notion of an optimal policy for such settings, we tackle the joint optimization of the cumulative execution cost and the number of state observations, both taken in expectation and under discounting. To approach this multi-objective optimization problem, we introduce an algorithm that can identify the Pareto front for a class of schedules that are advantageous in the discounted setting. The algorithm proceeds in an accumulative fashion, prepending additions to a working set of schedules and then computing incremental changes to the value functions. Because exhaustive construction becomes computationally prohibitive even for moderate-sized problems, we propose a filtering approach to prune the working set. Empirical results demonstrate that this filtering is effective at reducing computation while incurring only a negligible reduction in quality. In summarizing our findings, we provide some characterization of the run-time versus quality trade-off involved.
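To make the notion of a Pareto front over the two objectives concrete, the following is a minimal illustrative sketch in Python. It is not the paper's algorithm: it only shows how a set of candidate schedules, each already summarized by a hypothetical pair (expected discounted cost, expected discounted number of observations), could be reduced to its non-dominated members when both quantities are minimized. The function name `pareto_front` and the sample numbers are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical summary of one schedule:
# (expected discounted execution cost, expected discounted observation count).
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, assuming both objectives are minimized."""
    front: List[Point] = []
    fewest_obs = float("inf")
    # Sort by cost, breaking ties by observation count. A point is then
    # non-dominated exactly when it uses strictly fewer observations than
    # every point that precedes it in this order.
    for cost, obs in sorted(set(points)):
        if obs < fewest_obs:
            front.append((cost, obs))
            fewest_obs = obs
    return front


# Illustrative values only; they are not results from the paper.
candidates = [(10.0, 5.0), (12.0, 3.0), (11.0, 6.0), (15.0, 1.0), (15.0, 2.0)]
front = pareto_front(candidates)
print(front)
```

In this sketch, a schedule is discarded as soon as another schedule is at least as good in both objectives and strictly better in one. The filtering step described in the abstract prunes the working set more aggressively than this, trading a small loss in quality for reduced computation.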


research
06/01/2022

Error-Bounded Approximation of Pareto Fronts in Robot Planning Problems

Many problems in robotics seek to simultaneously optimize several compet...
research
05/30/2018

A Flexible Multi-Objective Bayesian Optimization Approach using Random Scalarizations

Many real world applications can be framed as multi-objective optimizati...
research
01/24/2022

IMO^3: Interactive Multi-Objective Off-Policy Optimization

Most real-world optimization problems have multiple objectives. A system...
research
03/02/2022

Pareto Frontier Approximation Network (PA-Net) to Solve Bi-objective TSP

Travelling salesperson problem (TSP) is a classic resource allocation pr...
research
05/10/2017

Solving Multi-Objective MDP with Lexicographic Preference: An application to stochastic planning with multiple quantile objective

In most common settings of Markov Decision Process (MDP), an agent evalu...
research
04/27/2015

Further Connections Between Contract-Scheduling and Ray-Searching Problems

This paper addresses two classes of different, yet interrelated optimiza...
