Probabilistic Inference in Planning for Partially Observable Long Horizon Problems

10/18/2021 · by Alphonsus Adu-Bredu, et al.

For autonomous service robots to successfully perform long horizon tasks in the real world, they must act intelligently in partially observable environments. Most Task and Motion Planning approaches assume full observability of their state space, making them ineffective in stochastic and partially observable domains that reflect the uncertainties in the real world. We propose an online planning and execution approach for performing long horizon tasks in partially observable domains. Given the robot's belief and a plan skeleton composed of symbolic actions, our approach grounds each symbolic action by inferring continuous action parameters needed to execute the plan successfully. To achieve this, we formulate the problem of joint inference of action parameters as a Hybrid Constraint Satisfaction Problem (H-CSP) and solve the H-CSP using Belief Propagation. The robot executes the resulting parameterized actions, updates its belief of the world and replans when necessary. Our approach is able to efficiently solve partially observable tasks in a realistic kitchen simulation environment. Our approach outperformed an adaptation of the state-of-the-art method across our experiments.


I Introduction

Autonomous service robots have the potential to perform long horizon tasks such as cooking meals in restaurants and homes and setting tables. In order for this potential to be realized, such robots would have to plan actions over large state and time horizons. They would also have to account for the uncertainties in their perception and knowledge of their environment. To ensure tractability, planning for such long horizon tasks is often decomposed into planning for symbolic actions and for continuous motions. The class of approaches that interleave symbolic and continuous planning is called integrated Task and Motion Planning (TAMP) [hpn, ffrob, srivastava, garrettfactored, lgp, tampsurvey].

Fig. 1: Given vague goals to make a snack, the robot generates and executes a coherent plan to successfully complete the assigned task.

Major challenges for robots planning and acting in the real world are the uncertainty in the robot's knowledge of the current state of the world and the uncertainty in the effects of the robot's actions on the future state of the world. If these uncertainties are not accounted for during planning, the robot is likely to fail at the task at hand. Most Task and Motion Planning approaches [hpn, ffrob, srivastava, garrettfactored, lgp, tampsurvey] assume full observability of their state space, leading them to fail in stochastic and partially observable domains that reflect the uncertainties in the real world. We propose Satisfying HYbrid COnstraints with Belief pRopAgation (SHY-COBRA) as an approach for planning for long horizon, partially observable TAMP problems. SHY-COBRA takes as inputs the robot's noisy belief of the state of the world and a plan skeleton composed of symbolic actions that achieve a specified goal, and infers satisfying parameter values for the actions needed to execute the plan successfully in partially observable domains. To do so, SHY-COBRA formulates the problem of joint inference of action parameters as a Hybrid Constraint Satisfaction Problem (H-CSP) [tlpskeleton], represented as a factor graph. The factors in the factor graph are the action constraints, whilst the variables are the symbolic and continuous action parameters to be inferred. The continuous parameters are initialized from the robot's noisy belief of its environment. In most TAMP approaches [hpn, ffrob, srivastava, garrettfactored, lgp, tampsurvey], the H-CSP is solved either by sampling or by constrained optimization. Neither of these approaches explicitly accounts for the uncertainty distributions of the continuous action parameters, such as uncertainty in the pose estimates and in the robot's joint configurations. SHY-COBRA instead solves the H-CSP using Pull Message Passing for Nonparametric Belief Propagation [pmpnbp, karthikarticulated] because of its natural ability to account for arbitrary uncertainty models of the continuous variables. The robot executes the actions in turn and replans when necessary. We demonstrate our approach on several simulated partially observable long horizon tasks in a realistic simulation environment, as shown in Figure 1.

II Related Work

Our work focuses on the problem of planning for long horizon tasks. The class of approaches that interleave symbolic and continuous planning is called Task and Motion Planning (TAMP). Fully observable TAMP algorithms [ffrob, hpn, lagriffoul2014efficiently, tlpskeleton, lgp, garrettfactored] assume that the agent has full knowledge of its deterministic domain and that the agent's actions have deterministic outcomes on its environment. These assumptions are, however, not representative of the kinds of domains robots operate in in the real world, which are often stochastic and partially observable. In such domains, robots require the ability to plan in the face of incomplete knowledge and stochasticity in the effects of their actions. Relatively few methods in the TAMP literature have attempted to solve these types of TAMP problems [lpk2013tampbelief, abbeel2015modular, phiquepal2019partial, beliefpddlstream]. Partially observable TAMP problems are often formulated as Hybrid Constraint Satisfaction Problems [tlpskeleton] and solved either through sampling [beliefpddlstream, abbeel2015modular] or constraint-optimization methods [phiquepal2019partial]. Such approaches often determinize the belief via the maximum likelihood observation [abbeel2015modular, chadaxiomatic] rather than incorporating the entire distribution, which provides richer information. Garrett et al. (SS-Replan) [beliefpddlstream] represent and update the belief over object poses using particle filtering. However, their approach limits the scope of partial observability to that of object poses. In contrast, through the use of Nonparametric Belief Propagation [pmpnbp, karthikarticulated] on the constraint network, our approach provides an avenue for incorporating arbitrary uncertainty models on any of the variables whose values are to be inferred. We evaluate our approach against SS-Replan [beliefpddlstream] in our experiments.
A number of prior works [moon, decimation] have successfully solved Constraint Satisfaction Problems using Belief Propagation. Moon et al. [moon] formulate Sudoku solving as a Constraint Satisfaction Problem and encode it as a factor graph. The factors in this factor graph are row, column and 3x3 cell constraints, and the variables are individual cells. They successfully infer satisfying cell values for the empty cells in the Sudoku puzzle. There is also much work in the domain of structuring planning problems as problems of inference [toussaintstructured, BOTVINICK2012485, attias2003planning]. Toussaint et al. [toussaintstructured] use structured Dynamic Bayesian Networks to represent structured planning domains and employ loopy belief propagation to solve short-horizon reaching tasks under collision avoidance constraints with a humanoid upper body. Our approach formulates Task and Motion Planning as a Hybrid Constraint Satisfaction Problem (H-CSP) and uses Pull Message Passing for Nonparametric Belief Propagation (PMPNBP) [pmpnbp] to infer maximum joint beliefs that solve the H-CSP.

Fig. 2: Outline of our approach. The e_i are the symbolic effects of the actions a_i, and the τ_i are the inferred trajectories sent to the robot controller for execution.

III Problem Formulation

Given an initial belief b_0 over object poses and robot joint angles, and a symbolic plan skeleton π = (a_1, …, a_n), we aim to jointly ground each symbolic action a_i into a target robot pose q_i in configuration space, along with the robot trajectory τ_i that takes the robot from pose q_{i−1} to q_i. The robot can then sequentially execute the generated τ_i to achieve the end effect of each symbolic action a_i. After each trajectory execution, the robot perceives and updates the belief, and replans if the updated belief does not satisfy the desired end effect of the corresponding symbolic action. We formulate the problem of jointly grounding a symbolic plan skeleton into a sequence of target robot poses along with the in-between trajectories as a Hybrid Constraint Satisfaction Problem [tlpskeleton], where the constraints are imposed by the desired end effects of each symbolic action in the given plan, as well as other task-agnostic constraints such as collision-free and motion-cost constraints.

Fig. 3: A constraint network. Given a task plan {a_1, a_2} composed of an action to pick up the pear (a_1) and an action to transport the pear to the basin of a sink (a_2), we form a factor graph that imposes constraints (rectangular nodes) as factors on the variable nodes (circular nodes) that represent object poses, robot configurations and grasp poses. In this example, q_1 and q_2 represent the initial and target robot configurations for action a_1, respectively. p_1 and p_2 represent the initial pear pose and the target pear pose after the execution of action a_2. g represents a grasp pose for grasping the pear. Note that the initial robot configuration of action a_2 is not included in this factor graph because it has the same value as q_2. Each target robot pose is also connected with factors that express the kinematic feasibility and collision-free constraints; we omit these factors for clarity. The grasp stability constraint on g is also not shown in this diagram. The motion constraint node connected to variable nodes q_1, q_2 and τ encourages τ to be the shortest trajectory between configurations q_1 and q_2. PMPNBP is run on this factor graph to jointly infer satisfying action parameters.
Input: high-level goals G, initial belief b
1  π ← SymbolicPlanner(G)           ▷ Generate a plan skeleton π that achieves goals G using a symbolic planner
2  Function SHY-COBRA(π, b):
3      N ← ConstraintNetwork(π)      ▷ Convert π into a constraint network
4      Initialize(N, b)              ▷ Initialize variable nodes in N with b
5      PMPNBP(N)                     ▷ Pass messages across N using the PMPNBP algorithm until convergence
6      π ← MaxSamples(N, π)          ▷ Set free parameters in π to the max-product assignments of the corresponding nodes in N
7      foreach action a in π do
8          o ← ExecuteAction(a)      ▷ Receive observation o
9          b ← UpdateBelief(b, o)
10         if b ≠ ExpectedEffect(a) then
11             π ← UpdatePlanSkeleton(π, b)   ▷ Update plan skeleton to reflect the current state after executing a
12             SHY-COBRA(π, b)
13             return
14         end if
15     end foreach
16     return
End Function
Algorithm 1 SHY-COBRA
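Algorithm 1's execute-and-replan loop can be sketched in Python as follows. The grounding, execution, perception and replanning routines are hypothetical stand-ins passed as callables, not the authors' implementation; the sketch only illustrates the control flow.

```python
# Sketch of the SHY-COBRA execute/replan loop (Algorithm 1), with
# hypothetical stand-ins for planning, inference and perception calls.
from dataclasses import dataclass, field

@dataclass
class Action:
    name: str
    expected_effect: str
    params: dict = field(default_factory=dict)

def shy_cobra(plan, belief, ground, execute, update_belief, replan, depth=0):
    """Ground the plan skeleton, execute it, and recurse on surprises."""
    if depth > 10:                      # guard against unbounded replanning
        raise RuntimeError("replanning limit reached")
    grounded = ground(plan, belief)     # H-CSP inference fills free parameters
    for action in grounded:
        observation = execute(action)
        belief = update_belief(belief, observation)
        if observation != action.expected_effect:   # unexpected effect
            new_plan = replan(plan, belief)         # update the plan skeleton
            return shy_cobra(new_plan, belief, ground, execute,
                             update_belief, replan, depth + 1)
    return belief
```

In the paper, `ground` corresponds to building the constraint network and running PMPNBP to convergence; here it is an opaque callable so the loop structure stands on its own.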

IV Methodology

IV-A Satisfying Hybrid Constraints with Belief Propagation (SHY-COBRA)

As described in Algorithm 1 and Figure 2, SHY-COBRA takes as input the robot's noisy belief of the state of the world and a plan skeleton [tlpskeleton] composed of symbolic actions. The plan skeleton is obtained from a symbolic planner that plans actions to achieve the specified goal(s). Actions in this plan skeleton have free parameters, such as grasp poses and arm trajectories, whose values must be inferred before the actions can be executed in the world. We formulate the problem of inferring the values of the free parameters as a Hybrid Constraint Satisfaction Problem (H-CSP) (Section IV-B) and represent it as a constraint network, as shown in Figure 3. The variable nodes in the constraint network are initialized with uniformly weighted sets of samples that represent the robot's noisy belief of the value of the corresponding variable. For instance, a variable node representing a target object's pose is initialized with a set of uniformly weighted poses that represent the uncertainty distribution of the pose as estimated by the robot's noisy perception system. The H-CSP is solved by inferring the satisfying maximum joint belief of the variable nodes using a max-product version of Pull Message Passing for Nonparametric Belief Propagation (PMPNBP) [pmpnbp] (Section IV-C). We then assign each free parameter in the plan skeleton the max-product assignment of the belief of its corresponding variable node and execute the plan in the world. If the robot's action has an unexpected effect while executing the plan, we update the plan skeleton and the constraint network and perform message passing with PMPNBP to infer the variable assignments for the new constraint network. This process continues until the robot successfully completes the task. The following subsections describe the components of SHY-COBRA in detail.

IV-B Hybrid Constraint Satisfaction Problem

Finding values for the action parameters in a plan skeleton that satisfy all the sets of constraints is a Hybrid Constraint Satisfaction Problem (H-CSP). The joint set of action parameters and constraints of a plan skeleton forms a factor graph called a constraint network [dechter1992constraint, lagriffoul2014efficiently]: a factor graph with constraints as factor nodes and action parameters as variable nodes. Edges exist between constraints and their corresponding action parameters, as depicted in Figure 3. Formally, an H-CSP is represented as

G = (X, Φ),   Φ = {φ_1(X_1), …, φ_F(X_F)}     (1)

where X_i ⊆ X is the subset of action parameter variable nodes that are subject to constraint factor node φ_i, G is the factor graph and F is the number of factors. To solve the H-CSP is to infer values for all action parameters that satisfy their corresponding constraints. In this work, we propose to solve the H-CSP by performing max-product Nonparametric Belief Propagation on the constraint network that represents the plan skeleton. The following subsections describe the Nonparametric Belief Propagation algorithm we use, our message-passing scheme and how the values of the action parameters are inferred from noisy observations and partial knowledge of the robot's environment.
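A constraint network of this kind can be captured in a minimal data structure. The sketch below is illustrative only; the class and method names (`ConstraintNetwork`, `satisfied`) are assumptions, and constraints are modeled as callables that score an assignment of their scope.

```python
# Minimal constraint-network (factor-graph) representation: variables hold
# weighted sample sets, factors pair a constraint function with its scope.
class ConstraintNetwork:
    def __init__(self):
        self.variables = {}          # name -> list of (value, weight) samples
        self.factors = []            # list of (constraint_fn, scope) pairs

    def add_variable(self, name, samples):
        # samples: list of (value, weight) pairs forming the node's belief
        self.variables[name] = samples

    def add_constraint(self, fn, scope):
        # fn scores an assignment of the variables named in scope
        self.factors.append((fn, scope))

    def satisfied(self, assignment, threshold=0.5):
        # An assignment solves the H-CSP if every constraint scores
        # above threshold on its scope.
        return all(fn(*(assignment[v] for v in scope)) > threshold
                   for fn, scope in self.factors)
```

A kinematic-feasibility constraint, for example, would be a callable over a configuration variable and a grasp variable that returns a high score only for jointly feasible pairs.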

Fig. 4: The kitchen simulation environment. Area (A) is the dining area, (B) is the grocery cabinet, (C) is the cooking area and (D) is the sink area. To make a pear dinner according to an optimal plan from the SHY-COBRA planner, the robot first moves to the grocery cabinet, opens one of the drawers in the cabinet and inspects it for the pear. If the pear is not found in the drawer, the robot updates its belief of the location of the pear and replans to inspect a different drawer. If the pear is found, the robot (1) picks up the pear, (2) moves to the sink, turns on the tap and washes the pear. The robot then (3) moves to the stove and cooks the pear. After cooking the pear, the robot (4) finally moves to the dining table and serves the meal.

IV-C Solving the H-CSP using PMPNBP

To solve the H-CSP, we use PMPNBP to infer action parameter values that satisfy all the constraints in the constraint network. Mathematically, this is equivalent to inferring the action parameter assignments that maximize the joint probability

p(X) = (1/Z) ∏_{i=1}^{F} φ_i(X_i)     (2)

where Z is a normalizing constant, X_i ⊆ X is the subset of action parameter variable nodes that are subject to constraint factor node φ_i, and F is the number of factors. In the context of our work, the X_i include the robot poses q_i in configuration space as well as the robot trajectories τ_i discussed in Section III. To infer satisfying variable assignments, we perform message passing on the constraint network with max-product PMPNBP [pmpnbp]. At the beginning, the belief of each variable node is initialized with uniformly weighted samples generated by specialized generators operating on the robot's initial belief; the generators used in this work are described in detail in Section V-C. Message passing on a constraint network involves two kinds of messages: the constraint-to-variable message and the variable-to-constraint message. The constraint-to-variable message for iteration t is computed as

m^t_{f→d} = {(x_d^{(k)}, w_d^{(k)})}_{k=1}^{M},   x_d^{(k)} ∼ bel^{t−1}_d(x_d)     (3)

where the messages m^{t−1}_{i→f} from the variable nodes with edges to constraint node f, except variable node d, enter through the weights w_d^{(k)}. As formulated above, samples are drawn from the belief distribution of node d from the previous iteration, bel^{t−1}_d. The constraint function of the constraint node f is then used to compute a weight for each of these samples. These weighted samples, which now constitute the constraint-to-variable message, are then passed to node d. Details on constraint functions can be found in Section IV-D. The variable-to-constraint message for iteration t is computed as

m^t_{d→f}(x_d) = ∏_{s ∈ ne(d)∖f} m^t_{s→d}(x_d)     (4)

where ne(d)∖f are all the constraint nodes with edges to the variable node d except constraint node f. As formulated above, to approximate this product of incoming messages, we take the union of all incoming messages from the neighboring constraint nodes except constraint node f and normalize their weights. The resulting samples are then resampled and passed to constraint node f. One iteration of message passing on the constraint network proceeds in the following sequence:

  1. Pass Variable-to-constraint messages from all variables to their corresponding constraints

  2. Pass Constraint-to-variable messages from all constraints to their corresponding variables

After each iteration of message passing, the belief of each variable node is updated by taking the union of all incoming messages to the variable node, normalizing their weights and resampling a new set to represent the belief of the variable node. Message passing is performed for several iterations until the maximum joint belief of each variable node converges. After convergence, each action parameter is assigned with the max-product assignment of the belief of its variable node.
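The union-normalize-resample belief update described above can be sketched as follows; the function name and fixed particle count are illustrative, not taken from the paper.

```python
import random

def resample_belief(messages, n_samples, rng=random.Random(0)):
    """Union the incoming messages, normalize their weights, and resample
    a fixed-size particle set (uniform weights after resampling)."""
    pooled = [s for msg in messages for s in msg]   # union of (value, weight)
    total = sum(w for _, w in pooled)
    values = [v for v, _ in pooled]
    probs = [w / total for _, w in pooled]          # normalized weights
    resampled = rng.choices(values, weights=probs, k=n_samples)
    return [(v, 1.0 / n_samples) for v in resampled]
```

Resampling keeps the particle set at a fixed size across iterations, so the cost of one message-passing round does not grow as messages accumulate.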

IV-D Constraints and Constraint functions

IV-D1 Constraints

The constraints used in this work are

  • Motion constraints that enforce that τ is the shortest trajectory from one robot configuration to another

  • Kin constraints that enforce kinematic feasibility of the robot in configuration q holding an object with grasp g

  • CfreeH constraints that enforce that when the robot is holding an object with grasp g, the trajectory τ through which the object is moved is collision free

  • GraspH constraints that enforce that a grasp g is a stable grasp pose

  • grasp constraints that enforce that when a robot at configuration q_1 picks an object at pose p_1 with grasp g, the grasp remains feasible at a later time when the robot at configuration q_2 places the object at pose p_2

  • Stable constraints that enforce that a placement pose p of an object is stable and will not cause the object to fall

  • inBasin constraints that enforce that the object is placed in a stable pose in the sink basin

  • inSaucepan constraints that enforce that the object is placed in a stable pose in the saucepan.

IV-D2 Constraint functions

A constraint function assigns weights to samples drawn from the belief distribution of the target variable node when the constraint-to-variable message is computed. The weight of each sample drawn from a variable node is computed as

w_d^{(k)} = φ_f(x_d^{(k)}, x_1^*, …, x_{M−1}^*)     (5)

where w_d^{(k)} is the weight of sample x_d^{(k)} drawn from variable node d, φ_f is the constraint function of constraint node f, M is the number of variable nodes connected by an edge to constraint node f, and x_1^*, …, x_{M−1}^* are the highest-weighted samples from the messages received from the other variable nodes connected to f. Each type of constraint node has a unique constraint function for assessing the weight of a sample.

Constraint function example: Consider a collision-free constraint node sending a message to an arm-trajectory variable node. The collision-free constraint node has edges to both an arm-trajectory variable node and a grasp-pose variable node. To compute the constraint-to-variable message to be sent to the arm-trajectory variable node, samples τ_i are first drawn from the belief of the arm-trajectory variable node. To weight each arm-trajectory sample, the constraint function takes as inputs the sampled arm trajectory τ_i and the highest-weighted grasp-pose sample g^*. It then computes the weight of the sampled arm trajectory as

w_i = α · d_obs(τ_i) − β · d_grasp(τ_i, g^*)     (6)

where w_i is the computed weight of sample τ_i, α and β are user-defined constants, d_obs is a routine that computes the cumulative distance along τ_i from the obstacles in its environment, and d_grasp is a routine that computes the distance between the end-effector pose after travelling along τ_i and the highest-weighted grasp pose g^*. The weights for all arm-trajectory samples are computed and normalized, constituting the constraint-to-variable message sent to the arm-trajectory variable node. Some other constraint functions used in this work are the
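Assuming Eq. (6) combines an obstacle-clearance term and a grasp-distance term linearly, the collision-free constraint function could look like the sketch below. The function and routine names (`cfree_weight`, the clearance and grasp-distance helpers) are illustrative stand-ins, and trajectories are simplified to lists of 2-D waypoints.

```python
import math

# Sketch of the collision-free constraint function of Eq. (6): a trajectory
# sample is weighted up for staying clear of obstacles and weighted down
# for ending far from the highest-weighted grasp pose g_star.
def cfree_weight(traj, g_star, obstacles, alpha=1.0, beta=1.0):
    def clearance(pt):
        # distance from a waypoint to the nearest obstacle
        return min(math.dist(pt, ob) for ob in obstacles)
    d_obs = sum(clearance(pt) for pt in traj)   # cumulative obstacle distance
    d_grasp = math.dist(traj[-1], g_star)       # end pose vs. grasp pose
    return alpha * d_obs - beta * d_grasp
```

A trajectory that hugs obstacles or terminates far from the chosen grasp thus receives a low weight and is unlikely to survive resampling.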

  • Kinematic constraint function, which weights robot joint-configuration samples and grasp poses based on how kinematically feasible a joint configuration of the robot's arm is when holding an object with a particular grasp pose.

  • Stable constraint function, which weights placement poses based on how stable they are, i.e., how geometrically stable an object placed on a surface in a specific pose is.

  • Grasp constraint function, which weights robot configurations, initial object poses and grasp poses based on how jointly feasible they are and how well they satisfy later target object poses.

  • Motion constraint function, which weights trajectories, initial configurations and final configurations based on how short the trajectory from the initial configuration to the final configuration is.

  • Grasp stability constraint function, which weights grasp poses based on how geometrically stable they are.

  • inBasin constraint function, which weights placement poses in the basin based on how stable they are.

  • inSaucepan constraint function, which weights placement poses in the saucepan based on how stable they are.

V Implementation

V-A Action Schemas

Actions that make up a plan are described by continuous action parameters, constraints, preconditions and effects. Preconditions are the conditions that must be true before an action can be executed. Effects describe the changes in a subset of the state after an action is executed. Continuous action parameters are the continuous values needed by the robot to execute the action in the world; these include object poses, grasps, robot configurations and trajectories. Constraints must hold true over the continuous parameters for the action to be executed successfully in the world. We used Fast Downward [fd] as the symbolic planner for planning over action schemas. The types of constraints used in the composition of action schemas in this work are described in Section IV-D. Some examples of action schemas [actionschema] used in this work are as follows:

(:action pick[obj]
 :parameters (q, p, g, τ)
 :constraints CFree(τ), Stable[obj](p), GraspH[obj](g), Kin[obj](q, p, g)
 :preconditions (and (at robot q) (at obj p) (handempty))
 :effects (and (holding obj) (not (handempty)))
)
(:action place[obj]
 :parameters (q, p, g, τ)
 :constraints CFree(τ), Stable[obj](p), GraspH[obj](g), Kin[obj](q, p, g)
 :preconditions (and (holding obj) (at robot q) (at obj p))
 :effects (and (not (holding obj)) (handempty))
)
(:action wash[obj]
 :parameters (p)
 :constraints Stable[obj](p), InBasin[obj](p)
 :preconditions (at obj p)
 :effects (clean obj)
)

where obj represents the target object and q, p, g, τ represent the robot joint configuration, object pose, grasp and joint trajectory respectively.


Fig. 5: (a) Noisy pose estimate of a single pear in the cabinet drawer. (b) Noisy joint configuration estimate of the robot’s right arm. Each pear in (a) and arm configuration in (b) represents a likely pose of the pear or arm joint configuration sampled from their respective estimated noise distributions.

V-B Uncertainty sources and uncertainty distributions

The major advantage of SHY-COBRA over Garrett et al. [beliefpddlstream] is its ability to concurrently and seamlessly incorporate arbitrary uncertainty sources and distributions. In our experiments, we consider pose estimation uncertainty and robot arm joint-configuration uncertainty, as depicted in Figure 5. We assume that the uncertainties in pose estimation and joint-configuration estimation are Gaussian distributed with zero means and variances σ_p² and σ_q² respectively. It is worth noting that SHY-COBRA is agnostic to the type of uncertainty distribution of an action parameter variable. The robot is equipped with a perception system that updates the robot's belief after every action.
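As a sketch, the zero-mean Gaussian belief sampling described above might look as follows. The function names, the tuple pose representation and the per-coordinate noise model are assumptions for illustration.

```python
import random

# Sample particle sets for the robot's initial belief, assuming zero-mean
# Gaussian noise on each pose coordinate (sigma_p) and on each joint
# angle (sigma_q), as in Section V-B.
def sample_pose_belief(pose_estimate, sigma_p, n, rng=random.Random(0)):
    return [tuple(c + rng.gauss(0.0, sigma_p) for c in pose_estimate)
            for _ in range(n)]

def sample_joint_belief(joint_estimate, sigma_q, n, rng=random.Random(0)):
    return [[q + rng.gauss(0.0, sigma_q) for q in joint_estimate]
            for _ in range(n)]
```

Because SHY-COBRA is agnostic to the noise model, either sampler could be swapped for one drawing from any other distribution without changing the rest of the pipeline.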

V-C Computing initial samples

We compute initial samples for the various free variables by using their corresponding specialized generators. Each free variable comes with a generator that computes samples for the free variable given the initial belief. Consider an action that picks up a target object, as described in Section V-A above. This action takes as parameters a pose variable, a grasp variable, an arm-trajectory variable and a joint configuration variable. The initial samples for the pose variable node in the corresponding plan skeleton consist of uniformly weighted poses received from the robot’s perception system’s noisy estimation of the pose of the object.
Likewise, the initial grasp samples are generated by a grasp generator that computes valid, uniformly weighted grasps of the object at each of the initial sampled poses. The initial arm-trajectory samples are generated by a specialized arm-trajectory generator equipped with the RRT-Connect [rrtconnect] motion planner; this generator produces uniformly weighted joint trajectories to each of the grasp samples. Finally, to generate robot joint-configuration samples, the joint-configuration generator, which is equipped with the IKFast [ikfast] inverse kinematics solver, generates a joint-configuration sample for each of the grasp samples. With each of these joint-configuration samples q̄ as a mean, we further sample joint configurations from the Gaussian-distributed joint-configuration noise with variance σ_q². During message passing, the weight of each such sample is computed as

w = N(q; q̄, σ_q²) · φ_f(q, x_1^*, …, x_{M−1}^*)

where q is the joint configuration sampled from the Gaussian distribution with mean q̄ and variance σ_q², N(q; q̄, σ_q²) is the Gaussian probability of q, φ_f is the constraint function of the constraint node f connected to q, and x_1^*, …, x_{M−1}^* are the highest-weighted samples from the messages received from the other variable nodes connected to f.
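The Gaussian-weighted joint-configuration sampling described above can be sketched as follows, with a trivial stand-in constraint function; all names are illustrative, and the joint configuration is reduced to a single scalar for brevity.

```python
import math
import random

def gauss_pdf(q, mu, sigma):
    # Gaussian probability density of q under N(mu, sigma^2)
    return math.exp(-0.5 * ((q - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def weight_config_samples(mu, sigma, n, constraint_fn, rng=random.Random(0)):
    """Draw n joint-configuration samples around an IK solution mu and
    weight each by its Gaussian probability times the constraint score."""
    samples = [rng.gauss(mu, sigma) for _ in range(n)]
    weights = [gauss_pdf(q, mu, sigma) * constraint_fn(q) for q in samples]
    total = sum(weights)
    return [(q, w / total) for q, w in zip(samples, weights)]
```

With a constant constraint function the weights simply follow the Gaussian prior; a real constraint function would additionally down-weight configurations that are kinematically infeasible for the chosen grasp.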

VI Experiments

We performed experiments on 12 randomly generated problems across 4 different tasks, as described in Section VI-A. The experiments were performed with a simulated Digit robot [digit] in the PyBullet simulation software [pybullet], as shown in Figure 4. We used IKFast [ikfast] to compute inverse kinematics solutions for the robot arms and the pybullet_planning package [pyplan] for motion planning. We quantitatively compare SHY-COBRA with SS-Replan*, a variation of the SS-Replan algorithm [beliefpddlstream] that uses off-the-shelf RRT-Connect [rrtconnect] and IKFast [ikfast] for motion planning and inverse kinematics respectively. For each cycle of planning, we generate 100 samples from the uncertainty distribution of the object pose estimate and 100 samples from the joint-configuration estimate, and run PMPNBP for 10 iterations. The pose estimate noise is Gaussian distributed with zero mean and a 10 cm standard deviation; the joint-configuration estimate noise is Gaussian distributed with zero mean and a 0.25 rad standard deviation. The experiments were run on a laptop with a 2.21 GHz Intel Core i7 CPU, 32 GB RAM and a GTX 1070 GPU.

VI-A Tasks

We evaluated SHY-COBRA and SS-Replan* on 12 randomly generated problems of 4 different tasks. See the accompanying video for demonstrations of the tasks in simulation. The tasks are described as follows:

VI-A1 Retrieve

The high-level goal for this task is to retrieve the pear. The prior over the location of the pear is uniformly distributed across the 3 grocery drawers. Because of occlusion and poor lighting in the drawers, the robot also has to deal with noisy estimates of the pose of the pear. A successful plan opens a drawer at random and inspects it. If the pear is located, the search is done; otherwise, the robot updates its belief of the location of the pear and repeats the process until the pear is located. It then picks up the pear.

VI-A2 Wash

The high-level goal for this task is to retrieve the pear and wash it. This task has the same prior belief as the task above. A successful plan performs the Retrieve task as described above, takes the pear to the sink and turns on the tap to wash it.

VI-A3 Cook

The high-level goal for this task is to retrieve the pear, wash it and cook it. The main challenge in this task is to infer grasps and trajectories that allow the robot to pick up a cup in a specific grasp pose that makes later actions, such as filling the cup with water and pouring the water into the saucepan, feasible during the execution of the cooking task. This task has the same prior belief as the tasks above. A successful plan performs the Wash task as described above, takes the pear to the stove, puts it in the saucepan, picks up a cup, fills it with water, pours the water into the saucepan and presses the cook button on the stove to cook the pear.

VI-A4 Serve-meal

The high-level goal for this task is to retrieve the pear, wash it, cook it and serve it. This task has the same prior belief as the tasks above. A successful plan performs the Cook task as described above, picks up the cooked pear from the saucepan and sets it on a tray, carries the tray to the dining table and distributes the cooked pear to the plates on the table.

VI-B Results

Table I shows the experimental results for SHY-COBRA and SS-Replan* on the tasks described above.

Alg: SHY-COBRA SS-Replan*
Task: Planning Time N.E Planning Time N.E
Retrieve
Wash
Cook
Serve-meal
TABLE I: Results from the evaluation of SHY-COBRA and SS-Replan*. The table shows the mean planning time in seconds (Planning Time) and the number of errors (N.E) over all 12 randomly generated problems for the 4 tasks.
Fig. 6: Plots comparing the planning times and number of errors of SHY-COBRA and SS-Replan* for increasing uncertainties in pose estimation and joint configuration when performing the Retrieve task. σ_p represents the standard deviation of the zero-mean object pose estimation noise and σ_q represents the standard deviation of the zero-mean joint-configuration estimation noise.

An error occurs when the robot misses its target when picking or placing an object due to noise in the object's pose or joint-configuration estimate. Based on the results, SHY-COBRA was consistently more efficient than SS-Replan* across all 4 tasks and made slightly fewer errors on each. Since SS-Replan* only considers object pose uncertainty and does not account for the uncertainty in the robot arm's joint configurations, it misses its target more often and as a result makes more errors and takes longer to plan. Despite its ability to concurrently incorporate arbitrary uncertainty sources and distributions through the use of PMPNBP for inference, SHY-COBRA's planning time grows with the magnitude of the estimation noise. We demonstrate this by comparing the planning times of SHY-COBRA and SS-Replan* under increasing noise in object pose estimation and joint-configuration estimation, as shown in Figure 6.

VII Conclusion

We proposed a planning and execution approach for long horizon tasks under uncertainty. Our approach jointly infers the satisfying action parameter values for a plan skeleton that are needed to successfully execute the plan in a stochastic, partially observable environment. Our approach outperformed an adaptation of the SS-Replan algorithm across all tasks in our experiments.

References