Learning Barrier Functions for Constrained Motion Planning with Dynamical Systems

03/25/2020 ∙ by Matteo Saveriano, et al. ∙ DLR Leopold Franzens Universität Innsbruck Technische Universität München 0

Stable dynamical systems are a flexible tool to plan robotic motions in real-time. In the robotic literature, dynamical system motions are typically planned without considering possible limitations in the robot's workspace. This work presents a novel approach to learn workspace constraints from human demonstrations and to generate motion trajectories for the robot that lie in the constrained workspace. Training data are incrementally clustered into different linear subspaces and used to fit a low dimensional representation of each subspace. By considering the learned constraint subspaces as zeroing barrier functions, we are able to design a control input that keeps the system trajectory within the learned bounds. This control input is effectively combined with the original system dynamics preserving eventual asymptotic properties of the unconstrained system. Simulations and experiments on a real robot show the effectiveness of the proposed approach.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 7

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

Modern robots are asked to perform complex tasks in unstructured environments and need to be endowed with a large set of pre-defined behaviors. Ideally, such behaviors are compactly represented, generate motion trajectories in real-time, and generalize to different execution contexts. Moreover, the list of behaviors should be easy to update. A trend in robotics suggests to use stable dynamical systems (DS) for planning of robotic skills [DMP, SEDS, NeuralLearn2, Clf, tau-SEDS, Perrin16, Blocher17]. DS can be eventually combined with Programming by Demonstrations (PbD) techniques [Billard_PbD] to intuitively acquire novel skills.

Among the others, an interesting property of stable DS is their capability to plan converging motions from any point of the state space. At run time, the initial robot position is fed into the DS and the converging reference trajectory is automatically generated. However, in constrained environments, planning with DS may fail because there is no guarantee that the generated motion satisfies the constraints.

This work focuses on motion planning with DS in a constrained workspace and, more specifically, in formally guaranteeing that generated robot trajectories do not violate the constraints (see Fig. 1). The idea is to consider the constraints on the robot motion as constraints on the state of the DS and to generate feasible trajectories that the low-level robot controller is able to track. We exploit a parameterized representation of the constraints and develop an approach to incrementally fit novel constraints from training data. In our framework, constraint representations are interpreted as zeroing barrier functions [Ames17, Wieland07, Xu15, Rauscher16]. Under the assumption that the constrained workspace is a convex set, we are able to design a control input that prevents the DS trajectories to violate the constraints. The control input is computed by solving an optimization problem trying to minimize the influence of the resulting controller on asymptotic properties of the unconstrained DS. The optimization problem admits an analytical solution that makes the approach computationally efficient and effectively usable in robotic applications.

Figure 1: Overview of the proposed approach. Kinesthetic teaching is used both to add new constraints and to remove previously learned ones.

The rest of the paper is organized as follows. Section II summarizes related work in the field. Section III provides preliminary knowledge on DS and barrier functions. The extension to multiple barrier functions is discussed in Sec. IV. An approach for incremental learning of barrier functions is described in Sec. V. Comparison with existing approaches and experiments on a real robot are presented in Sec. VI. Section VII concludes the paper.

Ii Related Work

Ii-a Motion planning with dynamical systems

One of the first examples of motion planning with DS, dynamic movement primitives (DMPs) [DMP] superimpose a non-linear forcing term to a linear dynamics to reproduce a demonstrated motion. The forcing term is suppressed by a time dependent signal to ensure convergence to the goal. DMPs can plan in joint or Cartesian space [Ude14, Saveriano19] and can be parameterized to customize the execution [TP-DMP, Pervez18].

A limitation of DMPs is that stability is retrieved by canceling the forcing term after a certain time. The stable estimator of dynamical systems

[SEDS] overcomes this limitation by using a non-linear and autonomous DS, learned from demonstrations, and constraining the DS parameters to ensure stability. Constraints adopted in [SEDS] prevent to accurately learn non-linear motions. This accuracy problem is faced in several works [NeuralLearn2, Clf, tau-SEDS, Perrin16]. For instance, [tau-SEDS, Perrin16] transform the dynamics in a diffeomorphed space where demonstrations are accurately reproduced. The work in [NeuralLearn2] derives a set of stability constraints from a learned Lyapunov function, while [Clf] uses the Lyapunov function to stabilize the DS. A common result among [NeuralLearn2, Clf, tau-SEDS, Perrin16] is that DS are able to plan stable motions with complex shapes.

Stable DS have several interesting properties that make them an effective tool for reactive motion planning. Complex dynamics are learned from demonstrations which greatly simplifies the DS design. DS generate converging motions from any point in the state space and the convergence is still preserved if the DS velocity is scaled with a positive scaling factor. DS trajectories can be modified to avoid possible collisions [DS_avoidance, Saveriano13, Saveriano14, Saveriano17, Hoffmann09] or to realize an incremental learning of motor skills [Nemec_15, Kulvicius13, Maeda17, Kronander15, Saveriano18, khoramshahi18]. DS parameters can encode kinematic and impedance behaviors [Calinon10, SaverianoURAI14]

and can be iteratively refined, e.g. via reinforcement learning

[Kormushev10, Buchli11].

Our work is complementary to these approaches. We use kinesthetic teaching to demonstrate the boundary of the admissible set. Given the constraints, the dynamics of a given DS—stable or unstable, manually designed or learned from demonstrations—are constrained within the admissible set while preserving eventual asymptotic properties of the DS.

Ii-B Constrained dynamics and motion planning

DS are typically used in a robot and workspace agnostic manner. However, robot motion is subject to constraints deriving from physical limitations (e.g. velocity limits) or restrictions in the workspace (e.g. obstacles). Our idea is to consider constraints on the robot motion as constraints on the state of the DS. In this way, the constrained DS generates a feasible trajectory that the low-level robot controller is able to accurately track. The states of the DS that comply with the constraints belong to the admissible set. Since DS trajectories may exit the admissible set, it is interesting to design a control input that renders the set forward invariant. In the control community, there are two main approaches for this, namely the invariance control and the barrier functions.

The invariance control [Wolff04, Wolff05, Scheint08, Kimmel14] renders the admissible set forward invariant with a discontinuous control. The work in [Zanchettin16] considers the robustness of the invariance control to uncertainties in the system dynamics or sensor measurements using quadratic programming. The problem of invariance control is that the computed control is discontinuous and may cause chattering. As shown in [Kimmel14], chattering can be reduced, but it is not guaranteed that chattering is avoided.

Barrier functions [Ames17, Wieland07, Xu15, Rauscher16] are capable of rendering the admissible set invariant with a smooth control action and therefore represent a suitable alternative to invariance control. In the literature, there are mainly two kinds of barrier functions. Reciprocal barrier functions are unbounded on the boundary of the admissible set, while zeroing barrier functions (ZBFs) vanish at the boundary. In this paper, we exploit ZBFs because: i) having an unbounded function is undesirable while designing a controller for real-time or embedded systems [Ames17] and ii) the controller derived from the ZBF formulation is robust to system and sensor uncertainties [Xu15]. An experimental comparison between reciprocal and zeroing barriers is presented in Sec. VI-A.

The problem of planning suitable robot motions can be alternatively seen as a path planning problem [Choset2005, Bowen17]. Global path planners are capable of discovering feasible paths even in complex scenarios with high-dimensional configuration spaces and non convex constraint sets, while guaranteeing completeness and asymptotic optimality of the plan [Bowen17]. On the contrary, the approach presented in this work applies to convex regions and Cartesian trajectories111The approach naturally extends to joint space only if joint constraints form a convex set—e.g. box constraints on the joint angles.. The drawback of path planners, compared with DS-based planners, is the higher computation time needed to plan the robot trajectory.

Iii Zeroing Barrier Functions

Iii-a Preliminaries

Consider the first-order DS , , where are the state and its time derivative respectively (position and velocity of the robot), is a locally Lipschitz non-linear function, and the output

is given by the vector of functions

. A solution of , namely , is called a trajectory. Different initial conditions generate different trajectories. Let us denote with the maximal interval of existence of . The DS is said to be forward complete if for any . A set is forward invariant if for every the trajectory for all . A point is an equilibrium point. An equilibrium is globally asymptotically stable if .

We focus on generating constrained and converging paths for the robot. A constraint is defined by a smooth scalar function . Using , the set of admissible states is defined as ; is the boundary and the interior of .

Iii-B Zeroing barrier functions

Definition 1

A continuous function is an extended class function for some if and is strictly increasing.

Definition 2

A smooth function is a zeroing barrier function (ZBF) for and for the admissible set , if there exist an extended class function and a set with such that, for all , , where the Lie derivative .

Note that is defined on a set larger than to consider the effects of model perturbations [Xu15]. In special cases, the condition on simplifies to , which is used in this work. The following lemma from [Ames17, Lemma 2] defines the conditions for the existence of .

Lemma 1

Assume that the compact admissible set is nonempty for a smooth function and consider the DS . If , , then for each , there exists a constant .

If a ZBF exists, then is forward invariant [Ames17, Proposit. 1].

Iii-C Zeroing control barrier functions (ZCBFs)

If the set is not forward invariant, it can be made forward invariant by designing a suitable controller . Let us consider the control affine system

(1)

where , , and are locally Lipschitz vector fields. For the controlled DS (1) it is possible to define a ZCBF.

Definition 3

A locally Lipschitz function is a zeroing control barrier function (ZCBF) for the DS (1) and the admissible set , if the Lie derivatives and are locally Lipschitz and there exists an extended class function such that

By applying to (1) any Lipschitz continuous control , , the set becomes forward invariant [Ames17, Corollary 2]. The set is non-empty if , meaning that has to be designed such that has a direct dependence on . This is the case for constraints of relative degree one [Isidori95].

Iv Motion Planning with Multiple ZCBF

In this section, we present an approach to enforce multiple constraints on a stable system used to plan robot motion. The DS is rendered globally asymptotically stable by a nominal control input . Hence, we derive an approach to merge the stabilizing controller with satisfaction of the constraints solving a quadratic program (QP). A similar problem is considered in [Rauscher16]. The differences are that we use zeroing instead of reciprocal barrier functions and that we constraint the planned motion instead of the robot dynamics.

Iv-a Multivariate ZCBF

Assume that constraints are specified for (1) via the smooth constraint functions . The set of admissible states that fulfill is defined as

(2)

which is the intersection of the sets associated with the constraints . The set is non-empty only if the individual constraints do not conflict. Assume that each is a ZCBF for the set . Any Lipschitz continuous control , with defined as in Sec. III-C, makes forward invariant. The set of admissible control values for is given by the intersection of all

(3)

where , and are tunable gains associated to each . If the set is non-empty, i.e. the constraints do not conflict, , and all state variables of the system (1) are controllable, the following result holds.

Theorem 1

Assume that the set defined by (2) is non-empty and that are the ZCBFs for the sets , for each . Then the set can be rendered forward invariant by applying any Lipschitz continuous controller to the system (1).

The set is non-empty and therefore a exists. If , it holds that . This is because the set is obtained by intersecting all the sets . Hence, renders each forward invariant [Ames17, Corollary 2]. Being the intersection of invariant sets an invariant set, renders forward invariant.

Iv-B Controller design

Theorem 1 allows to use any control input to render forward invariant. For states belonging to , where the constraints are not violated, the DS should be controlled by the stabilizing input to achieve desired performance. A possible way to combine these control objectives consists in solving the quadratic program

(4)

where the symbol indicates element-wise inequalities, and

(5)
Theorem 2

Consider (4) and the control affine system (1) subject to state constraints with associated zeroing control barrier functions . Assume that the rows of associated to active inequality constraints are linearly independent and that a locally Lipschitz control input is given for (1). If the set in (3) is non-empty, then the control law obtained by solving the optimization problem (4) is Lipschitz continuous and renders the set forward invariant.

Let us first prove that the solution of (4) is unique and Lipschitz continuous. The rows of associated to active inequality constraints are linearly independent by assumption. Moreover, it is assumed that the set is non-empty, which implies that . Recalling that each row of has the form , we conclude that the gradients of the active constraints are linearly independent. Hence, the linear independence constraint qualification is satisfied and the problem (4) admits a unique solution [Nocedal06]. The unique solution of (4) is Lipschitz continuous if , , and are Lipschitz continuous at [Morris13, Theorem 1]. is locally Lipschitz by assumption. and are Lipschitz continuous for all because , , and are locally Lipschitz (see Definition 3). Hence, the solution of (4) is unique and Lipschitz continuous for all .

Given that is Lipschitz continuous and that the constraints in (4) force to lie in , the set is forward invariant according to Theorem 1.

Remark 1

Theorem 2 can be eventually proved using the weaker Mangasarian–Fromovitz constraint qualification [Nocedal06]. The proof is analogous to [Rauscher16, Theorem 1], but considering ZCBFs instead of reciprocal control barrier functions.

Remark 2

The control input computed by solving (4) renders forward invariant even if the DS (1) is unstable. Asymptotic properties of the unconstrained DS are preserved if their objectives do not conflict with the forward invariance. For instance, the convergence to an equilibrium is preserved if is an admissible state, i.e. .

Remark 3

The analytic solution of (4) is , where contains only the rows of and the vector only the elements of corresponding to active constrains.

Iv-C Simulation results

We present some results on simulated DS to show the capabilities of the approach for constrained motion generation with multiple ZCBF and to clarify some important aspects of Theorem 2. We consider planar dynamical systems, namely a stable DS obtained with the approach in [SEDS] (Fig. 2(a)), the unstable DS (Fig. 2(b)), and the stable limit cycle in Fig. 2(c)-(d) defined as , . The nominal control input is and the matrix for all the considered systems. The set of admissible states is the intersection of linear constraint functions in the form . Each DS has an equilibrium point (stable for the first, unstable for the others) at , that is an admissible state by construction. The -th constraint is active if . We set . The control input that renders forward invariant is computed as in Remark 3.

Results in Fig. 2 show that, in all the considered cases, the control input renders the set forward invariant. This is an expected result since the assumptions of Theorem 2 are matched. In Fig. 2(a), we notice that all the trajectories asymptotically converge to the stable equilibrium . Indeed, being an admissible set, Remark 2 holds. Intuitively, we can say that if the unconstrained system is stable or stabilized through control at , moving along the frontier of sooner or later the DS generates a flow (velocity) pointing towards the interior of where only the stabilizing input is active. Then, the DS is driven towards the equilibrium point. Figure 2(b) shows that is forward invariant also for originally unstable dynamics (Remark 2). In this case, we do not expect that the control input guarantees asymptotic convergence to a unique equilibrium, but we know that trajectories starting in are bounded to . Finally, Theorem 2 does not guarantee the existence of periodic orbits within , even if the original DS has a limit cycle trajectory (Fig. 2(c)-(d)). Depending on the shape of the constraints and the limit cycle dynamics, the constrained system may exhibit a periodic trajectory within (Fig. 2(c)) or not (Fig. 2(d)).

Figure 2: Results obtained by constraining the dynamics of three different systems within a convex region. The four barrier functions are: , , , and .

V Incremental Learning of Barrier Functions

V-a Parametric ZCBF

As discussed in Sec. IV, zeroing control barrier functions have to match several requirements. In order to define a suitable parametric form for a ZCBF, we assume that the constraints are represented with a linear function

(6)

where the normal vector and the scalar are learnable parameters. In practice, a linear function as that in (6) represents a straight line if or a (hyper-)plane if . Under mild assumptions, the linear constraint function (6) represents a ZCBF. Indeed, it is easy to verify that has the following properties:

  1. is Lipschitz continuous.

  2. The gradient of , namely , is constant and therefore Lipschitz continuous.

  3. The Lie derivatives and are Lipschitz continuous since and are Lipschitz continuous by assumption.

  4. if has full column rank.

  5. The gradients of two active constraints and are linearly dependent iff , i.e. the normal vectors and are (anti-)parallel.

A set of admissible states , defined as the intersection of linear constraints (6) satisfying properties 1)–5), can be rendered forward invariant according to Theorem 2. Moreover, properties 1) and 2) always hold, while properties 3) and 4) depend on the given DS. Hence, an algorithm that learns the constraints has to simply check that the normal of the active constraints are not aligned for property 5) to hold.

V-B Incremental learning of ZCBFs

We aim at simultaneously clustering the data representing state constraints into multiple subspaces and finding a low-dimensional embedding (the linear ZBCF) for each cluster of points. This problem is known as subspace clustering [Vidal11]. An algorithm for clustering point clouds into planar subregions is proposed in [Donnarumma12]. This approach assumes that hundreds of points are available for each frame. The set of points is clustered into different planes based on the point to plane distances. In our setting, one point (robot position) is available for each time step. Therefore, we propose an approach simpler and faster than [Donnarumma12] that is well-suited for online applications. The proposed clustering algorithm is described as follows and summarized in Algorithm 1.

0.  , , , // training point at time , matrix of unlabeled points, set of learned ZCBF, threshold
1.  if  is empty then
2.     Add an empty column to and set it equal to
3.  else
4.     if  and  then
5.         lies on (eventually refine and )
6.     else if  for any  then
7.        Remove from // User pushed the robot beyond the barrier
8.     else if  then
9.        Add to // The new point belongs to a new cluster
10.     end if
11.  end if
12.  if  contains points then
13.     Fit , add to , empty
14.  end if
14.  ,
Algorithm 1 Incremental learning of multiple ZCBFs

In our setting, a new training point arrives for each time step . Assume that at current time no barrier has been learned yet. In this case, a new column is added to the matrix and set equal to the training point (lines in Algorithm 1). The algorithm waits for the next point. As soon as the matrix contains points, where is the dimension of the state space, it is possible to fit a linear ZCBF in the form of

. To this end, we perform a principal component analysis on

and set the normal vector and the scalar , where is the -th principal component of and is the centroid of the points in . To ensure that the goal is an admissible point, we check the sign of and set and if (from with and ). The barrier function defines a set of admissible states .

Let us now assume that the barrier function has been learned. At time step a new training point arrives. We consider three possible cases: i) lies on the plane described by , ii) lies outside , and iii) lies inside but not on . The three cases can be discriminated considering that , with , is the signed distance between and the plane described by . Looking at the value of , we have three cases:

  1. implies that lies on the plane described by . We can eventually refine and by using incremental principal component analysis as in [Donnarumma12].

  2. implies that is outside and is a non-admissible state. Being forward invariant, indicates that an external disturbance (the user) is pushing the robot beyond the barrier. We use this interaction modality to remove .

  3. implies that is an admissible point () that does not lie on the plane described by . It is stored in the matrix .

In case iii), the point is admissible but distant from . This is interpreted as the intention of the user to show a different barrier and is stored in . If contains points, we fit a new barrier function with associated set of admissible states . Otherwise, the algorithm waits for the next point. As discussed in Sec. IV-A, the new admissible set is the intersection of and . The described procedure is repeated for every new training point. Note that case i) avoids that two active constraints have their normals aligned. Hence, proposition 5) in Sec. V-A is satisfied and the learned set of admissible states can be rendered forward invariant.

The algorithm generalizes to more than two barriers. Given barriers and a new point , we check if: i) is admissible and belongs to one of the (line in Algorithm 1), ii) is not admissible, i.e. at least for one (line ), or iii) is admissible and far from all the barriers , (line ). Note that the threshold is used in Algorithm 1 to discriminate between the three cases. The threshold is needed in real cases to cope with possible uncertainties like noise in the measurements or imperfect robot control. The value of is determined considering that represents the distance between and the -th barrier. The threshold also triggers the creation of a new barrier if points are distant from existing barriers. In this way, the boundary of a non-linear domain is approximated with a set of linear functions (Fig. 3(a)). Finally, the algorithm works properly if consecutive training points represent the surface of the boundary. Otherwise, the barrier may deviate from the boundary as in Fig. 3(b). In Sec. VI-B, kinestetic teaching is used to collect proper training data.

Figure 3: (a) A non-linear boundary represented with several linear barriers. (b) Barriers learned with two different sets of points.

Vi Experimental Results

Vi-a Comparative results

The approach presented in Sec. IV renders a convex set forward invariant. The same problem can be solved with other techniques like Reciprocal Control Barrier Functions (RCBF) [Rauscher16] and Constrained Optimization (CO). Hence, we present an experimental comparison to underline similarities and differences among the three approaches. We consider the SEDS system used in Fig. 2(a) and the three barrier functions , , and .

ZCBF and constrained optimization (CO)

In the considered scenario, it holds that (stable DS) and . The barriers form the convex set with and . To render forward invariant, we compute a control input by solving

This CO problem is numerically solved in Matlab using the interior-point algorithm [Nocedal06]. As shown in Fig. 4(a), ZCBF and CO work comparably well for trajectories starting within the admissible set, guaranteeing forward invariance and convergence to the goal. However, in the considered 2D setup, CO takes about s to plan the next position while ZCBF, for which an analytic solution exists, takes less than a millisecond (ms). Results in Fig. 4(b) show that both approaches drive trajectories starting outside within the set. However, the controller obtained with CO is more aggressive and pushes the state within in one step with consequent high velocity. Although it is possible to consider bounds on the control input (), this will make the problem harder and increase the computation time. On the contrary, the controller obtained with ZCBF smoothly drives the state within , which is preferable in real applications.

Figure 4: Comparison of ZCBF with (a)–(b) Constrained Optimization (CO) and (c)–(d) Reciprocal Control Barrier Function (RCBF) approaches.

ZCBF and RCBF

For RCBF, we use the common barrier [Rauscher16, Ames17] that diverges for and is undefined for . Results in Fig. 4(c) show that the two approaches have a similar behavior for trajectories starting within the admissible set, guaranteeing forward invariance of and convergence to the goal. They also perform similarly in terms of computation time because also RCBF admits an analytic solution. The problem with RCBF appears when the trajectory starts outside the admissible set, where at least one of the is undefined. In Fig. 4(d), both and are undefined because . This causes the trajectory to diverge from the admissible set. In real applications, uncertainties due to noise or imperfect control may drive the trajectory slightly outside the admissible set. In this cases, ZCBF smoothly drives the trajectory within , showing its robustness to uncertainties.

Figure 5: Incremental learning of zeroing barrier functions. (a) Snapshots of the incremental learning process. (b) The robot position ( and over time) during the experiment. Dashed lines indicate kinesthetic teaching phases, solid lines represent the robot motion without external disturbances. (c) A comparison of the desired (grey dashed line) and executed (solid black line) trajectory in the unconstrained case. (d) The robot position in the - plane during the experiment. Dashed lines indicate kinesthetic teaching phases, solid lines represent the robot motion after the teaching, and solid grey lines represent the robot motion before the teaching. Note that only and coordinates are shown because the learning process has no effect on .

Vi-B Robot experiments

Setup and data collection

The proposed approach is tested on a KUKA LWR IV+ manipulator controlled at Hz through the fast research interface (FRI) [FRI]. The robot position is controlled with a Cartesian impedance controller, while the orientation is kept fixed. The stiffness matrix is , where to have a relatively stiff robot that accurately tracks the desired trajectory (see Fig. 5(c)). External torques applied to the robot during the teaching are estimated by the FRI. In case a non-zero torque is detected, the measured position in each time is commanded to the robot. This allows the user to distract the robot from the DS trajectory and to start collecting training data. When the teaching ends (no external torque), the DS generates on-line a new trajectory starting from the current robot position. As shown in Fig. 5(b), with this procedure the robot smoothly switches between teaching and execution phases. Training data are sampled at Hz, i.e. we store one position every samples, to eliminate points that are too close to each other. Alternatively, one can collect only points at a certain distance. Algorithm 1 is used to add/remove barriers. At run-time, the DS state is constrained by the computed control to remain within the admissible set.

Results

The results of the proposed approach are shown in Fig. 5. The desired robot trajectory in the - plane is generated with the SEDS system used in Sec. IV-C (Fig. 2(a)), while the motion is generated with the linear DS . The robot loops between the two goal positions and , depicted as black dots in Fig. 5(c) and (d), following a periodic motion. This is obtained by switching the desired goal after reaching the previous one. The top row of Fig. 5 shows snapshots of the interactive learning procedure. Kinesthetic teaching is effectively used to add new constraints or to remove existing ones, while the control based on the barrier functions renders the admissible set forward invariant generating a smooth constrained trajectory (middle and bottom rows of Fig. 5). As stated in Remark 2, being the goals and admissible states, the constrained motion converges to the desired goal.

Vi-C Discussion

Presented experiments show the effectiveness of our approach. The user demonstrates position constraints via kinesthetic teaching and linear barrier functions are incrementally learned. Moreover, the user can cancel a learned constraint by pushing the robot beyond the relative barrier. This is useful, for example, to cope with an erroneous demonstration, or to adapt the motion to a new scenario with different bounds.

In this paper, we use the DS in the so-called open loop configuration where the DS is initialized with the robot state and the resulting motion is sent to the robot as a reference trajectory to track. Alternatively, one can use the DS in closed loop configuration by continuously feeding the measured state into the DS. This allows a continuous adaptation of the motion to external perturbations, but requires a customized controller to ensure stability or passivity of the closed loop system [Kronander16]. The presented approach does not directly apply to the closed loop configuration because the control input would not render the admissible set forward invariant. Indeed, the robot dynamics are not considered in which may cause constraint violations. For the closed loop configuration, one has to directly modify the robot controller to ensure forward invariance of a given set. Although it is possible to constrain the robot controller [Kimmel14, Zanchettin16], the passivity of the closed loop system is not guaranteed. For this reason, we adopt the open loop configuration and leave the extension to closed loop configuration as a future work.

The ZCBF formulation in Sec. IV is valid for linear and non-linear ZCFBs. In this work, we used planar ZCBFs for two reasons: i) it is straightforward to satisfy the requirements of Theorem 2, and ii) training data are efficiently and incrementally clustered into planar regions. A limitation of the planar ZCBFs is shown in Fig. 6. To preserve the stability, the equilibrium point has to lie in the admissible set (Remark 2). This assumption can be violated if one wants to avoid a closed region like an obstacle. As shown in Fig. 6(b), the goal can be non-admissible and the motion stops at the boundary of the admissible set. Another possibility is that the starting point is non-admissible (Fig. 6(c)). In this case, the control input drives the state within the admissible set but the trajectory can enter the white area. To overcome this limitation, one can use a non-linear ZCBF and create a concave admissible set that contains starting and goal points but not the white region. The possibility of planning within concave sets will also make possible to extend the approach to the joint space, where obstacles have typically non-convex shapes. Extending the approach to non-linear functions with a concave admissible set is left as future work.

Figure 6: (a) The admissible set contains both the starting position and the goal. Trying to avoid a convex region (white area) the goal (square) (b) or the initial position (dot) (c) becomes a non-admissible state.

Vii Conclusions

We presented an approach to learn task space constraints from human demonstrations and to exploit the learned constraints to plan bounded trajectories for robotic manipulators. The approach is incremental and works on-line. Training data are firstly clustered in an unsupervised manner. Points belonging to newly discovered clusters are used to fit planar barrier functions representing the constraints. A previously learned constraint can be easily deleted by pushing the robot beyond the relative barrier. Learned barrier functions are then used to constrain the state of the dynamical system used to plan robot trajectories. We presented an optimization-based approach to compute a smooth control input that forces the DS trajectories to remain within the learned bounds and preserves eventual stability properties. Simulations and experiments show that our approach is effective in learning task space constraints and planning feasible motions.

As a future work, we plan to automatically generate training samples using surface-tracking techniques and to consider second-order DS to learn velocity barriers.

References