Fundamental Performance Limits for Sensor-Based Robot Control and Policy Learning

01/31/2022
by   Anirudha Majumdar, et al.

Our goal is to develop theory and algorithms for establishing the fundamental limits that a robot's sensors impose on performance for a given task. To this end, we define a quantity that captures the amount of task-relevant information provided by a sensor. Using a novel version of the generalized Fano inequality from information theory, we demonstrate that this quantity provides an upper bound on the highest achievable expected reward for one-step decision-making tasks. We then extend this bound to multi-step problems via a dynamic programming approach. We present algorithms for numerically computing the resulting bounds, and demonstrate our approach on three examples: (i) the lava problem from the literature on partially observable Markov decision processes, (ii) an example with continuous state and observation spaces corresponding to a robot catching a freely falling object, and (iii) obstacle avoidance using a depth sensor with non-Gaussian noise. We demonstrate the ability of our approach to establish strong limits on achievable performance for these problems by comparing our upper bounds with achievable lower bounds (computed by synthesizing or learning concrete control policies).
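
To make the one-step setting concrete, here is a minimal sketch for a small discrete problem. It is not the paper's bound: it uses ordinary Shannon mutual information I(S;O) as a stand-in for "information provided by a sensor", and it computes the achievable expected reward of a Bayes-optimal policy by brute force. The function names, the two-state example, and the reward matrix are all hypothetical, chosen only for illustration.

```python
import numpy as np

def mutual_information(prior, sensor):
    """Shannon mutual information I(S;O) in nats.

    prior:  shape (S,), p(s)
    sensor: shape (S, O), p(o | s)
    (Illustrative stand-in; the paper's task-relevant quantity may differ.)
    """
    joint = prior[:, None] * sensor            # p(s, o)
    p_o = joint.sum(axis=0)                    # p(o)
    indep = prior[:, None] * p_o[None, :]      # p(s) p(o)
    mask = joint > 0                           # skip zero-probability terms
    return float(np.sum(joint[mask] * np.log(joint[mask] / indep[mask])))

def best_one_step_reward(prior, sensor, reward):
    """Expected reward of the Bayes-optimal one-step policy.

    reward: shape (S, A), r(s, a). For each observation the policy picks
    the action with the highest posterior expected reward.
    """
    joint = prior[:, None] * sensor            # p(s, o), shape (S, O)
    score = joint.T @ reward                   # score[o, a] = sum_s p(s, o) r(s, a)
    return float(score.max(axis=1).sum())

# Hypothetical example: guess a binary state from a noisy binary sensor.
prior = np.array([0.5, 0.5])
reward = np.eye(2)                             # reward 1 for a correct guess

def symmetric_sensor(accuracy):
    """Binary sensor that reports the true state with the given accuracy."""
    return np.array([[accuracy, 1 - accuracy],
                     [1 - accuracy, accuracy]])

for acc in (0.5, 0.9, 1.0):
    sensor = symmetric_sensor(acc)
    print(acc,
          round(mutual_information(prior, sensor), 4),
          round(best_one_step_reward(prior, sensor, reward), 4))
```

In this toy problem, an uninformative sensor (accuracy 0.5) carries zero mutual information and the best policy can do no better than guessing, while a perfect sensor carries log 2 nats and achieves reward 1. This is the qualitative relationship between sensor information and achievable reward that the abstract describes.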

research
02/02/2023

Lower Bounds for Learning in Revealing POMDPs

This paper studies the fundamental limits of reinforcement learning (RL)...
research
10/24/2021

Off-Policy Evaluation in Partially Observed Markov Decision Processes

We consider off-policy evaluation of dynamic treatment rules under the a...
research
06/11/2018

PAC-Bayes Control: Synthesizing Controllers that Provably Generalize to Novel Environments

Our goal is to synthesize controllers for robots that provably generaliz...
research
02/12/2014

Planning for Decentralized Control of Multiple Robots Under Uncertainty

We describe a probabilistic framework for synthesizing control policies ...
research
05/02/2021

High Dimensional Decision Making, Upper and Lower Bounds

A decision maker's utility depends on her action a∈ A ⊂ℝ^d and the payof...
research
06/14/2023

Theoretical Hardness and Tractability of POMDPs in RL with Partial Hindsight State Information

Partially observable Markov decision processes (POMDPs) have been widely...
research
09/20/2018

Task-Driven Estimation and Control via Information Bottlenecks

Our goal is to develop a principled and general algorithmic framework fo...
