RMPflow: A Computational Graph for Automatic Motion Policy Generation

11/16/2018 ∙ by Ching-An Cheng, et al. ∙ 0

We develop a novel policy synthesis algorithm, RMPflow, based on geometrically consistent transformations of Riemannian Motion Policies (RMPs). RMPs are a class of reactive motion policies designed to parameterize non-Euclidean behaviors as dynamical systems in intrinsically nonlinear task spaces. Given a set of RMPs designed for individual tasks, RMPflow can consistently combine these local policies to generate an expressive global policy, while simultaneously exploiting sparse structure for computational efficiency. We study the geometric properties of RMPflow and provide sufficient conditions for stability. Finally, we experimentally demonstrate that accounting for the geometry of task policies can simplify classically difficult problems, such as planning through clutter on high-DOF manipulation systems.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 14

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

In this work, we develop a new motion generation and control framework that enables globally stable controller design within intrinsically111An intrinsically non-Euclidean space is one which is defined by a non-constant Riemannian metric with non-trivial curvature. non-Euclidean spaces. Non-Euclidean geometries are not often modeled explicitly in robotics, but are nonetheless common in the natural world. One important example is the apparent non-Euclidean behavior of obstacle avoidance. Obstacles become holes in this setting. As a result, straight lines are no longer a reasonable definition of shortest distance—geodesics must, therefore, naturally flow around them. This behavior implies a form of non-Euclidean geometry: the space is naturally curved by the presence of obstacles.

The planning literature has made substantial progress in modeling non-Euclidean task-space behaviors, but at the expense of efficiency and reactivity. Starting with early differential geometric models of obstacle avoidance [1] and building toward modern planning algorithms and optimization techniques [2, 3, 4, 5, 6, 7, 8, 9], these techniques can calculate highly nonlinear trajectories. However, they are often computationally intensive, sensitive to noise, and unresponsive to perturbation. In addition, the internal nonlinearities of robots due to kinematic constraints are sometimes simplified in the optimization.

At the same time, a separate thread of literature, emphasizing fast reactive control over computationally expensive planning, developed efficient closed-loop control techniques such as Operational Space Control (OSC) [10]. But while these techniques account for internal geometries from the robot’s kinematic structure, they assume simple Euclidean geometry in task spaces [11, 12], failing to provide a complete treatment of the external geometries. As a result, obstacle avoidance, e.g., has to rely on extrinsic potential functions, leading to undesirable deacceleartion behavior when the robot is close to the obstacle. If the non-Euclidean geometry can be intrinsically considered, then fast obstacle avoidance motion would naturally arise as traveling along the induced geodesic. The need for a holistic solution to motion generation and control has motivated a number of recent system architectures tightly integrating planning and control [13, 14].

We develop a new approach to synthesizing control policies that can accommodate and leverage the modeling capacity of intrinsically non-Euclidean robotics tasks. Taking inspiration from Geometric Control Theory [15],222 See Appendix A.1 for a discussion of why geometric mechanics and geometric control theory constitute a good starting point. we design a novel recursive algorithm, RMPflow, based on a recently proposed mathematical object for representing nonlinear policies known as the Riemannian Motion Policy (RMP) [16]. This algorithm enables the geometrically consistent fusion of many component policies defined across non-Euclidean task spaces that are related through a tree structure. We show that RMPflow, which generates behavior by calculating how the robot should accelerate, mimics the Recursive Newton-Euler algorithm [17] in structure, but generalizes it beyond rigid-body systems to a broader class of highly-nonlinear transformations and spaces.

In contrast to existing frameworks, our framework naturally models non-Euclidean task spaces with Riemannian metrics that are not only configuration dependent, but also velocity dependent. This allows RMPflow to consider, e.g., the direction a robot travels to define the importance weights in combing policies. For example, an obstacle, despite being close to the robot, can usually be ignored if robot is heading away from it. This new class of policies leads to an extension of Geometric Control Theory, building on a new class of non-physical mechanical systems we call Geometric Dynamical Systems (GDS).

We also show that RMPflow is Lyapunov-stable and coordinate-free. In particular, when using RMPflow, robots can be viewed each as different parameterizations of the same task space, defining a precise notion of behavioral consistency between robots. Additionally, under this framework, the implicit curvature arising from non-constant Riemannian metrics (which may be roughly viewed as position-velocity dependent inertia matrices in OSC) produces nontrivial and intuitive policy contributions that are critical to guaranteeing stability and generalization across embodiments. Our experimental results illustrate how these curvature terms can be impactful in practice, generating nonlinear geodesics that result in curving or orbiting around obstacles. Finally, we demonstrate the utility of our framework with a fully reactive real-world system on multiple dual-arm manipulation problems.

2 Motion Generation and Control

Motion generation and control can be formulated as the problem of transforming curves from the configuration space to the task space . Specifically, let be a -dimensional smooth manifold. A robot’s motion can be described as a curve such that the robot’s configuration at time is a point . Without loss of generality, suppose has a global coordinate , called the generalized coordinate; for short, we would identify the curve with its coordinate and write as . A typical example of the generalized coordinate is the joint angles of a

-DOF (degrees-of-freedom) robot: we denote

as the joint angles at time and , as the joint velocities and accelerations. To describe the tasks, we consider another manifold , the task space, which is related to the configuration space through a smooth task map . The task space can be the end-effector position/orientation [10, 18], or more generally can be a space that describes whole-body robot motion, e.g., in simultaneous tracking and collision avoidance [19, 20]. The goal of motion generation and control is to design the curve so that the transformed curve exhibits desired behaviors on the task space .

Notation

For clarity, we use boldface to distinguish the coordinate-dependent representations from abstract objects; e.g. we write and . In addition, we will often omit the time- and input-dependency of objects unless necessary; e.g. we may write and . For derivatives, we use both symbols and , with a transpose relationship: for and a differential map , we write . For a matrix , we denote as its th column and as its element. To compose a matrix, we use for vertical (or matrix) concatenation and for horizontal concatenation. For example, we write and . We use and to denote the symmetric, positive semi-definite/definite matrices, respectively.

2.1 Motion Policies and the Geometry of Motion

We model motion as a second-order differential equation333 We assume the system has been feedback linearized. A torque-based setup can be similarly derived by setting the robot inertia matrix as the intrinsic metric on  [11]. of , where we call a motion policy and the state. In contrast to an open-loop trajectory, which forms the basis of many motion planners, a motion policy expresses the entire continuous collection of its integral trajectories and therefore is robust to perturbations. Motion policies can model many adaptive behaviors, such as reactive obstacle avoidance [21, 13] or responses driven by planned Q-functions [22], and their second-order formulation enables rich behavior that cannot be realized by the velocity-based approach [23].

The geometry of motion has been considered by many planning and control algorithms. Geometrical modeling of task spaces is used in topological motion planning [3], and motion optimization has leveraged Hessian to exploit the natural geometry of costs [24, 5, 25, 26]. Ratliff et al. [2], e.g., use the workspace geometry inside a Gauss-Newton optimizer and generate natural obstacle-avoiding reaching motion through traveling along geodesics of curved spaces.

Geometry-aware motion policies were also developed in parallel in controls. OSC is the best example [10]. Unlike the planning approaches, OSC focuses on the internal geometry of the robot and considers only simple task-space geometry. It reshapes the workspace dynamics into a simple spring-mass-damper system with a constant inertia matrix, enforcing a form of Euclidean geometry in the task space. Variants of OSC have been proposed to consider different metrics [27, 11, 20], task hierarchies [19, 28], and non-stationary inputs [29].

While these algorithms have led to many advances, we argue that their isolated focus on either the internal or the external geometry limits the performance. The planning approach fails to consider reactive dynamic behavior; the control approach cannot model the effects of velocity dependent metrics, which are critical to generating sensible obstacle avoidance motions, as discussed in the introduction. While the benefits of velocity dependent metrics was recently explored using RMPs [16], a systematic understanding is still an open question.

3 Automatic Motion Policy Generation with RMPflow

RMPflow is an efficient manifold-oriented computational graph for automatic generation of motion policies. It is aimed for problems with a task space that is related to the configuration space through a tree-structured task map , where is the th subtask. Given user-specified motion policies on as RMPs, RMPflow is designed to consistently combine these subtask policies into a global policy on . To this end, RMPflow introduces 1) a data structure, called the RMP-tree, to describe the tree-structured task map and the policies, and 2) a set of operators, called the RMP-algebra, to propagate information across the RMP-tree. To compute at time , RMPflow operates in two steps: it first performs a forward pass to propagate the state from the root node (i.e. ) to the leaf nodes (i.e. ); then it performs a backward pass to propagate the RMPs from the leaf nodes to the root node. These two steps are realized by recursive use of RMP-algebra, exploiting shared computation paths arising the tree structure to maximize efficiency.

3.1 Structured Task Maps

(a)
(b)
(c)
Figure 1: Tree-structured task maps

In most cases, the task-space manifold is structured. In this paper, we consider the case where the task map can be expressed through a tree-structured composition of transformations , where is the th transformation. Fig. 1 illustrates some common examples. Each node denotes a manifold and each edge denotes a transformation. This family trivially includes the unstructured task space (Fig. 1a) and the product manifold (Fig. 1b), where is the number of subtasks. A more interesting example is the kinematic tree (Fig. 1c), where, e.g., the subtask spaces on the leaf nodes can describe the tracking and obstacle avoidance tasks along a multi-DOF robot.

The main motivation of explicitly handling the structure in the task map is two-fold. First, it allows RMPflow to exploit computation shared across different subtask maps. Second, it allows the user to focus on designing motion policies for each subtask individually, which is easier than directly designing a global policy for the entire task space . For example, may describe the problem of humanoid walking, which includes staying balanced, scheduling contacts, and avoiding collisions. Directly parameterizing a policy to satisfy all these objectives can be daunting, whereas designing a policy for each subtask is more feasible.

3.2 Riemannian Motion Policies (RMPs)

Knowing the structure of the task map is not sufficient for consistently combining subtask policies: we require some geometric information about the motion policies’ behaviors [16]. Toward this end, we adopt an abstract description of motion policies, called RMPs [16], for the nodes of the RMP-tree. Specifically, let be an -dimensional manifold with coordinate . The canonical form of an RMP on is a pair , where is a continuous motion policy and is a differentiable map. Borrowing terminology from mechanics, we call the desired acceleration and the inertia matrix at , respectively.444Here we adopt a slightly different terminology from [16]. We note that and do not necessarily correspond to the inertia and force of a physical mechanical system. defines the directional importance of when it is combined with other motion policies. Later in Section 4, we will show that is closely related to Riemannian metric, which describes how the space is stretched along the curve generated by ; when depends on the state, the space becomes non-Euclidean. We additionally introduce a new RMP form, called the natural form. Given an RMP in its canonical form , the natural form is a pair , where is the desired force map. While the transformation between these two forms may look trivial, their distinction will be useful later when we introduce the RMP-algebra.

3.3 RMP-tree

The RMP-tree is the core data structure used by RMPflow. An RMP-tree is a directed tree, in which each node represents an RMP and its state, and each edge corresponds to a transformation between manifolds. The root node of the RMP-tree describes the global policy on , and the leaf nodes describe the local policies on . To illustrate, let us consider a node and its child nodes . Suppose describes an RMP and describes an RMP , where for some . Then we write and ; the edge connecting and points from to along . We will continue to use this example to illustrate how RMP-algebra propagates the information across the RMP-tree.

3.4 RMP-algebra

The RMP-algebra consists of three operators (pushforward, pullback, and resolve) to propagate information.555Precisely it propagates the numerical values of RMPs and states at a particular time. They form the basis of the forward and backward passes for automatic policy generation, described in the next section.

  1. pushforward is the operator to forward propagate the state from a parent node to its child nodes. Using the previous example, given from , it computes for each child node , where

    is a Jacobian matrix. The name “pushforward” comes from the linear transformation of tangent vector

    to the image tangent vector .

  2. pullback is the operator to backward propagate the natural-formed RMPs from the child nodes to the parent node. It is done by setting with

    (1)

    The name “pullback” comes from the linear transformations of the cotangent vector (1-form) and the inertia matrix (2-form) . In summary, velocities can be pushfowarded along the direction of , and forces and inertial matrices can be pullbacked in the opposite direction.

    To gain more intuition of pullback, we write pullback in the canonical form of RMPs. It can be shown that the canonical form of the natural form above is the solution to a least-squared problem:

    (2)

    where and . Because , pullback attempts to find an that can realize the desired accelerations while trading off approximation errors with an importance weight defined by the inertia matrix . The use of state dependent importance weights is a distinctive feature of RMPflow. It allows RMPflow to activate different RMPs according to both configuration and velocity (see Section 3.6 for examples). Finally, we note that the pullback operator defined in this paper is slightly different from the original definition given in [16], which ignores the term in (2). While ignoring does not necessary destabilize the system [20], its inclusion is critical to implement consistent policy behaviors.

  3. resolve is the last operator of RMP-algebra. It maps an RMP from its natural form to its canonical form. Given , it outputs with , where denotes Moore-Penrose inverse. The use of pseudo-inverse is because in general the inertia matrix is only positive semi-definite. Therefore, we also call the natural form of the unresolved form, as potentially it can be realized by multiple RMPs in the canonical form.

3.5 Algorithm: Motion Policy Generation

Now we show how RMPflow uses the RMP-tree and RMP-algebra to generate a global policy on . Suppose each subtask policy is provided as an RMP. First, we construct an RMP-tree with the same structure as , where we assign subtask RMPs as the leaf nodes and the global RMP as the root node. With the RMP-tree specified, RMPflow can perform automatic policy generation. At every time instance, it first performs a forward pass: it recursively calls pushforward from the root node to the leaf nodes to update the state information in each node in the RMP-tree. Second, it performs a backward pass: it recursively calls pullback from the leaf nodes to the root node to back propagate the values of the RMPs in the natural form, and finally calls resolve at the root node to transform the global RMP into its canonical form for policy execution (i.e. setting ).

The process of policy generation of RMPflow uses the tree structure for computational efficiency. For subtasks, it has time complexity in the worst case as opposed to of a naive implementation which does not exploit the tree structure. Furthermore, all computations of RMPflow are carried out using matrix-multiplications, except for the final resolve call, because the RMPs are expressed in the natural form in pullback instead of the canonical form suggested originally in [16]. This design makes RMPflow numerically stable, as only one matrix inversion is performed at the root node with both and in the span of the same Jacobian matrix due to pullback.

3.6 Example RMPs

We give a quick overview of some RMPs useful in practice (a complete discussion of these RMPs are postponed to Appendix D). We recall from (2) that dictates the directional importance of an RMP.

3.6.1 Collision/joint limit avoidance

Barrier-type RMPs are examples that use velocity dependent inertia matrices, which can express importance as a function of robot heading (a property that traditional mechanical principles fail to capture). Here we demonstrate a collision avoidance policy in the 1D distance space to an obstacle. Let for some functions and . We consider a motion policy such that and define its inertia matrix , where is a potential and is a damper. We choose to increase as decreases (close to the obstacle), to increase when (moving toward the obstacle), and to be constant when . With this choice, the RMP can be turned off in pullback when the robot heads away from the obstacle. This motion policy is a GDS and is its metric (cf. Section 4.1); the terms and are due to non-Euclidean geometry and produce natural repulsive behaviors.

3.6.2 Target attractors

Designing an attractor policy is relatively straightforward. For a task space with coordinate , we can consider an inertia matrix and a motion policy such that , where is a smooth attractor potential, is a damper, and is a curvature term. It can be shown that this differential equation is also a GDS (see Appendix D.4).

3.6.3 Orientations

As RMPflow directly works with manifold objects, orientation controllers become straightforward to design, independent of the choice of coordinate (cf. Section 4.4). For example, we can define RMPs on a robotic link’s surface in any preferred coordinate (e.g. in one or two axes attached to an arbitrary point) with the above described attractor to control the orientation. This follows a similar idea outlined in the Appendix of [16].

3.6.4 Q-functions

Perhaps surprising, RMPs can be constructed using Q-functions as metrics (we invite readers to read [16] for details on how motion optimizers can be reduced to Q-functions and the corresponding RMPs). While these RMPs may not satisfy the conditions of a GDS that we later analyze, they represent a broader class of RMPs that leads to substantial benefits (e.g. escaping local minima) in practice. Also, Q-functions are closely relate to Lyapunov functions and geometric control [30]; we will further explore this direction in future work.

4 Theoretical Analysis of RMPflow

We investigate the properties of RMPflow when the child-node motion policies belong to a class of differential equations, which we call structured geometric dynamical systems (structured GDSs). We present the following results.

  1. Closure: We show that the pullback operator retains a closure of structured GDSs. When the child-node motion policies are structured GDSs, the parent-node dynamics also belong to the same class.

  2. Stability: Using the closure property, we provide sufficient conditions for the feedback policy of RMPflow to be stable. In particular, we cover a class of dynamics with velocity-dependent metrics that are new to the literature.

  3. Invariance: As its name suggests, RMPflow is closely related to differential geometry. We show that RMPflow is intrinsically coordinate-free. This means that a set of subtask RMPs designed for one robot can be transferred to another robot while maintaining the same task-space behaviors.

Setup

We assume all that manifolds and maps are sufficiently smooth. For now, we assume also that each manifold has a single chart; the coordinate-free analysis is postponed to Section 4.4. All the proofs are provided in Appendix B.

4.1 Geometric Dynamical Systems (GDSs)

We define a family of dynamics useful to specify RMPs on manifolds. Let manifold be -dimensional with chart . Let , , and . The tuple is called a GDS if and only if

(3)

where , , and . We refer to as the metric matrix, as the damping matrix, and as the potential function which is lower-bounded. In addition, we define as the inertia matrix, which can be asymmetric. We say a GDS is non-degenerate if is nonsingular. We will assume (3) is non-degenerate so that it uniquely defines a differential equation and discuss the general case in Appendix A. induces a metric of , measuring its length as . When depends on and , it also induces the curvature terms and . In a particular case when , the GDSs reduce to the widely studied simple mechanical systems (SMSs) [15], ; in this case and the Coriolis force is equal to . The extension to velocity-dependent is important and non-trivial. As discussed in Section 3.6, it generalizes the dynamics of classical rigid-body systems, allowing the space to morph according to the velocity direction.

As its name suggests, GDSs possess geometric properties. Particularly, when is invertible, the left-hand side of (3) is related to a quantity , known as the geometric acceleration (cf. Section 4.4). In short, we can think of (3) as setting along the negative natural gradient while imposing damping .

4.2 Closure

Earlier, we mentioned that by tracking the geometry in pullback in (1), the task properties can be preserved. Here, we formalize the consistency of RMPflow as a closure of differential equations, named structured GDSs. Structured GDSs augment GDSs with information on how the metric matrix factorizes. Suppose has a structure that factorizes , where and , and . We say the tuple is a structured GDS if and only if

(4)

where . Note the metric and factorization in combination defines . As a special case, GDSs are structured GDSs with a trivial structure (i.e. ). Also, structured GDSs reduce to GDSs (i.e. the structure offers no extra information) if , or if (cf. Appendix B.1). Given two structures, we say preserves if has the factorization (of ) made by . In Section 4.4, we will show that structured GDSs are related to a geometric object, pullback connection, which turns out to be the coordinate-free version of pullback.

To show the closure property, we consider a parent node on with child nodes on . We note that and can be functions of both and . [] Let the th child node follow and have coordinate . Let and . If of the parent node is given by pullback with and is non-singular, the parent node follows , where , , , preserves , and . Particularly, if is velocity-free and the child nodes are GDSs, the parent node follows . Theorem 4.2 shows structured GDSs are closed under pullback. It means that the differential equation of a structured GDS with a tree-structured task map can be computed by recursively applying pullback from the leaves to the root. If all leaf nodes follow GDSs and at the root node is nonsingular, then the root node follows as recursively defined by Theorem 4.2.

4.3 Stability

By the closure property above, we analyze the stability of RMPflow when the leaf nodes are (structured) GDSs. For compactness, we will abuse the notation to write . Suppose is nonsingular and let be the resultant structured GDS at the root node. We consider a Lyapunov candidate and derive its rate using properties of structured GDSs. [] For , . Proposition 4.3 directly implies the stability of structured GDSs by invoking LaSalle’s invariance principle [31]. Here we summarize the result without proof. For , if , the system converges to a forward invariant set . To show the stability of RMPflow, we need to further check when the assumptions in Corollary 4.3 hold. The condition is easy to satisfy: by Theorem 4.2, ; to strictly ensure definiteness, we can copy into an additional child node with a (small) positive-definite damping matrix. The condition on can be satisfied similarly. In addition, we need to verify the assumption that is nonsingular. Here we provide a sufficient condition. When satisfied, it implies the global stability of RMPflow. [] Suppose every leaf node is a GDS with a metric matrix in the form for differentiable functions , , and satisfying , , and , where is the coordinate of the leaf-node manifold and . It holds . If further , then , and the global RMP generated by RMPflow converges to the forward invariant set in Corollary 4.3. A particular condition in Theorem 4.3 is when all the leaf nodes with velocity dependent metric are 1D. Suppose is its coordinate and is its metric matrix. The sufficient condition essentially boils down to and . This means that, given any , , is non-decreasing when , and non-increasing when . This condition is satisfied by the collision avoidance policy in Section 3.6.

4.4 Invariance

We now discuss the coordinate-free geometric properties of generated by RMPflow. Due to space constraint, we only summarize the results (please see Appendix B.4 and, e.g., [32]). Here we assume that is positive-definite.

We first present the coordinate-free version of GDSs (i.e. the structure is trivial) by using a geometric object called affine connection, which defines how tangent spaces on a manifold are related. Let denote the tangent bundle of , which is a natural manifold to describe the state space. We first show that a GDS on can be written in terms of a unique, asymmetric affine connection that is compatible with a Riemannian metric (defined by ) on . It is important to note that is defined on not the original manifold . As the metric matrix in a GDS can be velocity dependent, we need a larger manifold.

[] Let be a Riemannian metric on such that, for , , where and are symmetric and positive-definite, and is differentiable. Then there is a unique affine connection that is compatible with and satisfies, , , and , for and . In coordinates, if is identified as , then can be written as , where is a projection. We call the geometric acceleration of with respect to . It is a coordinate-free object, because is defined independent of the choice of chart of . By Theorem 4.4, it is clear that a GDS can be written abstractly as , where defines the covectors due to the potential function and damping, and denotes the inverse of . In coordinates, it reads as , which is exactly (3).

Next we present a coordinate-free representation of RMPflow. [] Suppose is related to leaf-node task spaces by maps and the th task space has an affine connection on , as defined in Theorem 4.4, and a covector function defined by some potential and damping as described above. Let be the pullback connection, be the pullback metric, and be the pullback covector, where . Then is compatible with , and can be written as . In particular, if is velocity-independent, then . Theorem 4.4 says that the structured GDS can be written abstractly, without coordinates, using the pullback of task-space covectors, metrics, and asymmetric affine connections (that are defined in Theorem 4.4). In other words, the recursive calls of pullback in the backward pass of RMPflow is indeed performing “pullback” of geometric objects. Theorem 4.4 also shows, when is velocity-independent, the pullback of connection and the pullback of metric commutes. In this case, , which is equivalent to the Levi-Civita connection of . The loss of commutativity in general is due to the asymmetric definition of the connection in Theorem 4.4, which however is necessary to derive a control law of acceleration, without further referring to higher-order time derivatives.

4.5 Related Approaches

While here we focus on the special case of RMPflow with GDSs, this family already covers a wide range of reactive policies commonly used in practice. For example, when the task metric is Euclidean (i.e. constant), RMPflow recovers OSC (and its variants) [10, 19, 11, 12, 20]. When the task metric is only configuration dependent, RMPflow can be viewed as performing energy shaping to combine multiple SMSs in geometric control [15]. Further, RMPflow allows using velocity dependent metrics, generating behaviors all those previous rigid mechanics-based approaches fail to model. We also note that RMPflow can be easily modified to incorporate exogenous time-varying inputs (e.g. forces to realize impedance control [18] or learned perturbations as in DMPs [29]). In computation, the structure of RMPflow in natural-formed RMPs resembles the classical Recursive Newton-Euler algorithm [17, 33] (see Appendix C). Alternatively, the canonical form of RMPflow in (2) resembles Gauss’ Principle [11, 12], but with a curvature correction on the inertia matrix (suggested by Theorem 4.2) to account for velocity dependent metrics. Thus, we can view RMPflow as a natural generalization of these approaches to a broader class of non-Euclidean behaviors.

5 Experiments

We perform controlled experiments to study the curvature effects of nonlinear metrics, which is important for stability and collision avoidance. We then perform several full-body experiments (video: https://youtu.be/aFJMxfWV760) to demonstrate the capabilities of RMPflow on high-DOF manipulation problems in clutter, and implement an integrated vision-and-motion system on two physical robots.

5.1 Controlled Experiments

(a)
(b)
(c)
(d)
Figure 2: Phase portraits (gray) and integral curves (blue; from black circles to red crosses) of 1D example. (a) Desired behavior. (b) With curvature terms. (c) Without curvature terms. (d) Without curvature terms but with nonlinear damping.

5.1.1 1D Example

Let . We consider a barrier-type task map and define a GDS in (3) with , , and , where . Using the GDS, we can define an RMP , where and are defined according to Section 4.1. We use this example to study the effects of in pullback (1), where we define . Fig. 2 compares the desired behavior (Fig. (a)a) and the behaviors of correct/incorrect pullback. If pullback is performed correctly with , the behavior matches the designed one (Fig. (b)b). By contrast, if is ignored, the observed behavior becomes inconsistent and unstable (Fig. (c)c). While the instability of neglecting can be recovered with a damping nonlinear in (suggested in [20]), the behavior remains inconsistent (Fig. (d)d).

(a)
(b)
(c)
(d)
(e)
Figure 3: 2D example; initial positions (small circle) and velocities (arrows). (a-d) Obstacle (circle) avoidance: (a) w/o curvature terms and w/o potential. (b) w/ curvature terms and w/o potential. (c) w/o curvature terms and w/ potential. (d) w/ curvature terms and w/ potential. (e) Combined obstacle avoidance and goal (square) reaching.
Figure 4: Results for reaching experiments. Though some methods achieve a shorter goal distance than RMPflow in successful trials, they end up in collision in most the trials.

5.1.2 2D Example

We consider a 2D goal-reaching task with collision avoidance and study the effects of velocity dependent metrics. First, we define an RMP (a GDS as in Section 3.6) in (the 1D task space of the distance to the obstacle). We pick a metric , where increases if the particle is close to the obstacle and (where ), increases if it moves towards the obstacle. As this metric is non-constant, the GDS has curvature terms and . These curvature terms along with produce an acceleration that lead to natural obstacle avoidance behavior, coaxing the system toward isocontours of the obstacle (Fig. (b)b). On the other hand, when the curvature terms are ignored, the particle travels in straight lines with constant velocity (Fig. (a)a). To define the full collision avoidance RMP, we introduce a barrier-type potential to create extra repulsive forces, where . A comparison of the curvature effects in this setting is shown in Fig. (c)c and (d)d (with ). Next, we use RMPflow to combine the collision avoidance RMP above (with ) and an attractor RMP. Let be the goal. The attractor RMP is a GDS in the task space with a metric , a damping , and a potential that is zero at , where (see Appendix D.4). Fig. (e)e shows the trajectories of the combined RMP. The combined non-constant metrics generate a behavior that transitions smoothly towards the goal while heading away from the obstacle. When the curvature terms are ignored (for both RMPs), the trajectories oscillate near the obstacle. In practice, this can result in jittery behavior on manipulators. When the metric is not velocity-based () the behavior is less efficient in breaking free from the obstacle to go toward the goal.

simulated worlds real-world experiments
Figure 5: Two of the six simulated worlds in the reaching experiments (left), and the two physical dual-arm platforms in the full system experiment (right).

5.2 System Experiments

5.2.1 Reaching-through-clutter Experiments

We compare RMPflow with OSC, (i.e. potential fields (PF) with dynamics reshaping), denoted as PF-basic, and a variant, denoted PF-nonlinear, which scales the collision-avoidance weights nonlinearly as a function of obstacle proximity. We highlight the results here; Appendix E provides additional details, and the supplementary video shows footage of the trials. In both baselines, the collision-avoidance task spaces are specified by control points along the robot’s body (rather than the distance space used in RMPflow) with an isotropic metric (here for PF-basic and for PF-nonlinear, where is the max metric size used in RMPflow). The task-space policies of both variants follow GDSs, but without the curvature terms (see Appendix E).

Fig. 4 summarizes their performance. We measure time-to-goal, C-space path length (assessing economy of motion), achievable distance-to-goal (efficacy in solving the problem), collision intensity (percent time in collision given a collision), collision failures (percent trials with collisions). The isotropic metrics, across multiple settings, fail to match the speed and precision achieved by RMPflow. Higher-weight settings tend to have fewer collisions and better economy of motion, but at the expense of efficiency. Additionally, adding nonlinear weights as in PF-nonlinear does not seem to help. The decisive factor of RMPflow’s performance is rather its non-isotropic metric, which encodes directional importance around obstacles in combing policies.

5.2.2 System Integration for Real-Time Reactive Motion Generation

We present an integrated system for vision-driven dual arm manipulation on two robotic platforms, the ABB YuMi robot and the Rethink Baxter robot (Fig. 5) (see the supplementary video). Our system uses the real-time optimization-based tracking algorithm DART [34] to communicate with the RMP system, receiving prior information on robot configuration and sending tracking updates of world state. The system is tested in multiple real-world manipulation problems, like picking up trash in clutter, reactive manipulation of a cabinet with human perturbation, active lead-through (compliant guiding of the arms with world-aware collision controllers) and pick-and-place of objects into a drawer which the robot opens and closes. Please see Appendix F for the details of the experiments.

6 Conclusion

We propose an efficient policy synthesis framework, RMPflow, for generating policies with non-Euclidean behavior, including motion with velocity dependent metrics that are new to the literature. In design, RMPflow is implemented as a computational graph, which can geometrically consistently combine subtask policies into a global policy for the robot. In theory, we provide conditions for stability and show that RMPflow is intrinsically coordinate-free. In the experiments, we demonstrate that RMPflow can generate smooth and natural motion for various tasks, when proper subtask RMPs are specified. Future work is to further relax the requirement on the quality of designing subtask RMPs by introducing learning components into RMPflow for additional flexibility.

References

  • [1] Rimon, E., Koditschek, D.: The construction of analytic diffeomorphisms for exact robot navigation on star worlds. Transactions of the American Mathematical Society 327(1), 71–116 (1991)
  • [2] Ratliff, N., Toussaint, M., Schaal, S.: Understanding the geometry of workspace obstacles in motion optimization. In: IEEE International Conference on Robotics and Automation (ICRA) (2015)
  • [3] Ivan, V., Zarubin, D., Toussaint, M., Komura, T., Vijayakumar, S.: Topology-based representations for motion planning and generalization in dynamic environments with interactions. International Journal of Robotics Research (IJRR) 32(9-10), 1151–1163 (2013)
  • [4] Watterson, M., Liu, S., Sun, K., Smith, T., Kumar, V.: Trajectory optimization on manifolds with applications to SO(3) and R3XS2. In: Robotics: Science and Systems (RSS) (2018)
  • [5] Toussaint, M.: Robot trajectory optimization using approximate inference. In: ICML. pp. 1049–1056 (2009)
  • [6] LaValle, S.M.: Planning Algorithms. Cambridge University Press, Cambridge, U.K. (2006), available at http://planning.cs.uiuc.edu/
  • [7] Karaman, S., Frazzoli, E.: Sampling-based algorithms for optimal motion planning. International Journal of Robotics Research (IJRR) 30(7), 846–894 (2011), http://arxiv.org/abs/1105.1186
  • [8]

    Gammell, J.D., Srinivasa, S.S., Barfoot, T.D.: Batch Informed Trees (BIT*): Sampling-based optimal planning via the heuristically guided search of implicit random geometric graphs. In: IEEE International Conference on Robotics and Automation (ICRA) (2015)

  • [9] Mukadam, M., Dong, J., Yan, X., Dellaert, F., Boots, B.: Continuous-time Gaussian process motion planning via probabilistic inference. arXiv preprint arXiv:1707.07383 (2017)
  • [10] Khatib, O.: A unified approach for motion and force control of robot manipulators: The operational space formulation. IEEE Journal on Robotics and Automation 3(1), 43–53 (1987)
  • [11] Peters, J., Mistry, M., Udwadia, F.E., Nakanishi, J., Schaal, S.: A unifying framework for robot control with redundant DOFs. Autonomous Robots 1, 1–12 (2008)
  • [12] Udwadia, F.E.: A new perspective on the tracking control of nonlinear structural and mechanical systems. Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences 459(2035), 1783–1800 (2003), http://rspa.royalsocietypublishing.org/content/459/2035/1783
  • [13] Kappler, D., Meier, F., Issac, J., Mainprice, J., Garcia Cifuentes, C., Wüthrich, M., Berenz, V., Schaal, S., Ratliff, N., Bohg, J.: Real-time perception meets reactive motion generation. IEEE Robotics and Automation Letters 3(3), 1864–1871 (2018), https://arxiv.org/abs/1703.03512
  • [14] Mukadam, M., Cheng, C.A., Yan, X., Boots, B.: Approximately optimal continuous-time motion planning and control via probabilistic inference. In: IEEE International Conference on Robotics and Automation (ICRA) (2017)
  • [15] Bullo, F., Lewis, A.D.: Geometric control of mechanical systems: modeling, analysis, and design for simple mechanical control systems, vol. 49. Springer Science & Business Media (2004)
  • [16] Ratliff, N.D., Issac, J., Kappler, D., Birchfield, S., Fox, D.: Riemannian motion policies. arXiv preprint arXiv:1801.02854 (2018)
  • [17] Walker, M.W., Orin, D.E.: Efficient dynamic computer simulation of robotic mechanisms. Journal of Dynamic Systems, Measurement, and Control 104(3), 205–211 (1982)
  • [18] Albu-Schaffer, A., Hirzinger, G.: Cartesian impedance control techniques for torque controlled light-weight robots. In: IEEE International Conference on Robotics and Automation (ICRA). vol. 1, pp. 657–663 (2002)
  • [19] Sentis, L., Khatib, O.: A whole-body control framework for humanoids operating in human environments. In: IEEE International Conference on Robotics and Automation (ICRA). pp. 2641–2648 (2006)
  • [20] Lo, S.Y., Cheng, C.A., Huang, H.P.: Virtual impedance control for safe human-robot interaction. Journal of Intelligent & Robotic Systems 82(1), 3–19 (2016)
  • [21] Erez, T., Lowrey, K., Tassa, Y., Kumar, V., Kolev, S., Todorov, E.: An integrated system for real-time model-predictive control of humanoid robots. In: IEEE/RAS International Conference on Humanoid Robots (2013)
  • [22] Todorov, E.: Optimal control theory. In Bayesian Brain: Probabilistic Approaches to Neural Coding pp. 269–298 (2006)
  • [23] Liegeois, A.: Automatic supervisory control of the configuration and behaviour of multibody mechanisms. IEEE Transactions on Systems, Man and Cybernetics 7(12), 868–871 (1977)
  • [24] Ratliff, N., Zucker, M., Bagnell, J.A.D., Srinivasa, S.: CHOMP: Gradient optimization techniques for efficient motion planning. In: IEEE International Conference on Robotics and Automation (ICRA) (2009)
  • [25] Mukadam, M., Yan, X., Boots, B.: Gaussian process motion planning. In: IEEE Conference on Robotics and Automation (ICRA) (2016)
  • [26] Dong, J., Mukadam, M., Dellaert, F., Boots, B.: Motion planning as probabilistic inference using Gaussian processes and factor graphs. In: Robotics: Science and Systems (RSS) (2016)
  • [27] Nakanishi, J., Cory, R., Mistry, M., Peters, J., Schaal, S.: Operational space control: A theoretical and empirical comparison. International Journal of Robotics Research (IJRR) 6, 737–757 (2008)
  • [28] Platt, R., Abdallah, M.E., Wampler, C.W.: Multiple-priority impedance control. In: IEEE International Conference on Robotics and Automation (ICRA). pp. 6033–6038. Citeseer (2011)
  • [29] Ijspeert, A.J., Nakanishi, J., Hoffmann, H., Pastor, P., Schaal, S.: Dynamical movement primitives: Learning attractor models for motor behaviors. Neural Computation 25(2), 328–373 (Feb 2013)
  • [30] Lewis, A.D.: The geometry of the maximum principle for affine connection control systems (2000)
  • [31] Khalil, H.K.: Noninear systems. Prentice-Hall, New Jersey 2(5), 5–1 (1996)
  • [32] Lee, J.M., Chow, B., Chu, S.C., Glickenstein, D., Guenther, C., Isenberg, J., Ivey, T., Knopf, D., Lu, P., Luo, F., et al.: Manifolds and differential geometry. Topology 643, 658 (2009)
  • [33] Featherstone, R.: Rigid Body Dynamics Algorithms. Springer (2008)
  • [34] Schmidt, T., Newcombe, R., Fox, D.: DART: Dense articulated real-time tracking with consumer depth cameras. Autonomous Robots 39(3) (2015)
  • [35] Taylor, J.R.: Classical Mechanics. University Science Books (2005)
  • [36] Udwadia, F.E., Kalaba, R.E.: Analytical Dynamics: A New Approach. Cambridge University Press (1996)

Appendices

Appendix A Geometric Dynamical Systems

Here we summarize details and properties of GDSs introduced in Section 4.1.

a.1 From Geometric Mechanics to GDSs

Our study of GDSs is motivated by geometric mechanics. Many formulations of mechanics exist, including Lagrangian mechanics [35] and the aforementioned Gauss’s Principle of Least Constraint [36]—–They are all equivalent, implicitly sharing same mathematical structure. In that sense, geometric mechanics, which models physical systems as geodesic flow on Riemannian manifolds, is the most explicit of these, revealing directly the underlying manifold structure and connecting to the broad mathematical tool set from Riemannian geometry. These connections enable us here to generalize beyond the previous simple mechanical systems studied in [15] to non-classical systems that more naturally describe robotic behaviors with non-Euclidean geometric properties.

a.2 Degenerate GDSs

Let us recall the definition of GDSs. Let and let and be differentiable. We say the tuple is a GDS if

(5)

where .

For degenerate cases, can be singular and (5) define rather a family of differential equations. Degenerate cases are not uncommon; for example, the leaf-node dynamics could have being only positive semidefinite. Having degenerate GDSs does not change the properties that we have proved, but one must be careful about whether differential equation satisfying (5) exist. For example, the existence is handled by the assumption on in Theorem 4.2 and the assumption on in Corollary 4.2. For RMPflow, we only need that at the root node is non-singular. In other words, the natural-form RMP created by pullback at the root node can be resolved in the canonical-form RMP for policy execution. A sufficient and yet practical condition is provided in Theorem 4.3.

a.3 Geodesic and Stability

For GDSs, they possess a natural conservation property of kinematic energy, i.e. it travels along a geodesic defined by when there is no external perturbations due to and . Note by definition may only be positive-semidefinite even when the system is non-degenerate; here we allow the geodesic to be defined for a degenerate metric, meaning a curve whose instant length measured by the (degenerate) metric is constant.

This geometric feature is an important tool to establish the stability of non-degenerate GDSs; We highlight this nice geometric property below, which is a corollary of Proposition 4.3. All non-degenerate GDSs in the form travel on geodesics. That is, , where . Note that this property also hold for degenerate GDSs provided that differential equations satisfying (5) exist.

a.4 Curvature Term and Coriolis Force

The curvature term in GDSs is highly related to the Coriolis force in the mechanics literature. This is not surprising, as from the analysis in Section 4.4 we know that comes from the Christoffel symbols of the asymmetric connection. Recall it is defined as

We show their relationship explicitly below. [] Let be the Christoffel symbol of the first kind with respect to , where the subscript denotes the element. Let and define . Then .

Proof of Lemma a.4.

Suppose . We can compare the two definitions and verify they are indeed equivalent:

Appendix B Proofs of RMPflow Analysis

b.1 Proof of Theorem 4.2

See 4.2

Proof of Theorem 4.2.

We will use the non-degeneracy assumption that (i.e. as we will show) is non-singular, so that the differential equation specified by an RMP in normal form or a (structured) GDS is unique. This assumption is made to simplify writing. At the end of the proof, we will show that this assumption only needs to be true at the root node of RMPflow.

The general case We first show the differential equation given by pullback is equivalent to the differential equation of pullback structured GDS . Under the non-degeneracy assumption, suppose factorizes as , where is some Jacobian matrix. On one hand, for pullback, because in the child node satisfies (where by definition ), the pullback operator combines the child nodes into the differential equation at the parent node,

(6)

where we recall is given by pullback. On the other hand, for with preserving , its dynamics satisfy

(7)

where is factorized by into