I Introduction
Many robotic manipulation tasks are sensitive to small changes in execution parameters such as target positions or stiffness. Applications also usually benefit significantly from improved execution time or reliability. However, optimizing such tasks is usually challenging because of their sequential nature. Signal Temporal Logic (STL), a formalism for describing the temporal characteristics of trajectories, is well-suited to this challenge. It provides a real-valued function, called the robustness metric, which is generated from a logical specification and can be used to evaluate the performance of robotic tasks.
In this work, we investigate the suitability of different robustness metrics for optimizing robotic manipulation tasks. There have been efforts towards applying STL in various robotic environments. The works in [1, 2] develop control techniques for multi-agent systems using STL specifications, and [3] uses STL rewards in Learning from Demonstration (LfD) on a 2D driving scenario. However, most works focus on directly optimizing the controller, which can be limiting for complex systems. Instead, we propose to use classical black-box optimization to improve existing tasks.
II STL Robustness Metric
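Formally, an STL specification can be understood as follows. Consider a discrete time sequence $t = 0, 1, 2, \dots$. An STL formula is built from predicates $\mu$ of the form $f(s_t) < d$, where $s_t$ is the state of the signal at time $t$, $f$ maps the state at each time point to a real value, and $d$ is a constant. The STL syntax is defined as $\varphi := \top \mid \mu \mid \neg\varphi \mid \varphi_1 \wedge \varphi_2 \mid \varphi_1 \vee \varphi_2 \mid G_{[a,b]}\varphi \mid F_{[a,b]}\varphi \mid \varphi_1 U_{[a,b]} \varphi_2$, where $[a,b]$ is the set of all $t$ such that $a \le t \le b$, with $0 \le a \le b$. The operators $\neg$, $\wedge$, and $\vee$ are the Boolean negation, conjunction, and disjunction operators, respectively. The temporal operators $G$, $F$, and $U$ are the globally, eventually, and until operators, respectively. A complete definition of STL can be found in [10], [11].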
The robustness metric, denoted as $\rho^{\varphi}(s, t)$, is the quantitative semantics of the STL formula $\varphi$ and measures “how well” the signal $s$ satisfies it at time $t$. The classical way of defining this semantics is space robustness [8]. This measure is positive if and only if the signal satisfies the specification (soundness property), and the closer the robustness is to zero, the smaller the changes in signal values required to change the truth value. Formally, space robustness and its corresponding operators are defined as follows.
(1)  
The remaining operators can be derived using (1). Although this is an intuitive and practical way to determine robustness, it has limitations, particularly due to the min and max functions. They are non-smooth and non-differentiable, which makes it more difficult for an optimizer to find a good solution. To address this issue, we discuss alternatives to space robustness and some of their properties.
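For reference, one common way of writing space robustness is sketched below. The notation is an assumption made for illustration and may differ from (1): predicates are taken to have the form $f(s_t) < d$, and $\rho^{\varphi}(s, t)$ denotes the robustness of formula $\varphi$ for signal $s$ at time $t$.

```latex
\begin{aligned}
\rho^{\mu}(s,t) &= d - f(s_t), \\
\rho^{\neg\varphi}(s,t) &= -\rho^{\varphi}(s,t), \\
\rho^{\varphi_1 \wedge \varphi_2}(s,t) &= \min\big(\rho^{\varphi_1}(s,t),\, \rho^{\varphi_2}(s,t)\big), \\
\rho^{\varphi_1 \vee \varphi_2}(s,t) &= \max\big(\rho^{\varphi_1}(s,t),\, \rho^{\varphi_2}(s,t)\big), \\
\rho^{G_{[a,b]}\varphi}(s,t) &= \min_{t' \in [t+a,\, t+b]} \rho^{\varphi}(s,t'), \\
\rho^{F_{[a,b]}\varphi}(s,t) &= \max_{t' \in [t+a,\, t+b]} \rho^{\varphi}(s,t').
\end{aligned}
```

In this form the min and max operators appear explicitly in conjunction, disjunction, and both temporal operators, which is where the non-smoothness enters.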
II-A Robustness Types
To address the above issues of space robustness, several alternative definitions have been proposed in recent years.
II-A1 Time Robustness
Instead of looking at how well individual signal values satisfy the specification, time robustness shifts the signal in time to quantify satisfaction [6]. However, time robustness is discontinuous at rising and falling edges, where it switches between positive and negative values. This kind of robustness is useful when signals should be ranked by how early or late they satisfy the specification.
II-A2 LSE Robustness
To deal with the smoothness problem of space robustness, [5] proposed a robustness formulation based on the Log-Sum-Exponential (LSE) approximation of the max and min operators. LSE is the logarithm of a sum of exponential terms, with a scalar multiplier acting as a scaling factor. This approximation is smooth because it is infinitely differentiable, and it can reduce the influence of spikes in the signal, but it comes at the cost of losing the soundness property.
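The LSE approximation described above can be sketched in a few lines of Python. This is a minimal illustration and not the formulation of [5] verbatim; the function names `lse_max` and `lse_min` and the default scaling factor `k` are assumptions made for this example.

```python
import math

def lse_max(values, k=10.0):
    """Smooth over-approximation of max(values) using log-sum-exp with scale k > 0."""
    m = max(values)  # subtract the maximum for numerical stability
    return m + math.log(sum(math.exp(k * (v - m)) for v in values)) / k

def lse_min(values, k=10.0):
    """Smooth under-approximation of min(values), computed as -lse_max(-values)."""
    return -lse_max([-v for v in values], k)

# Two slightly negative robustness values: the true max is negative (violation),
# but the smooth max is positive, which illustrates the loss of soundness.
violating = [-0.01, -0.02]
print(max(violating) < 0.0, lse_max(violating) > 0.0)
```

The approximation error is bounded by log(n)/k for n values, so a larger `k` brings the result closer to the true max or min at the price of steeper gradients.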
II-A3 AGM Robustness
The Arithmetic-Geometric Mean (AGM) robustness is an average-based robustness that modifies all STL operators at all time instances [10]. This definition captures how fast the signal satisfies the specification by computing arithmetic and geometric means. This normalizing approach can be useful when the signals have different units. However, AGM does not guarantee convergence with gradient-based optimization techniques [9].
II-A4 Smooth Robustness
Smooth robustness can be seen as a combination of the LSE and AGM robustness metrics [9]: unlike either of them on its own, it is both sound and guaranteed to converge. Its use case is similar to that of LSE, with the additional feature of giving a positive outcome only if the signal satisfies the specification.
II-A5 Avg Robustness
Avg robustness is a combination of space and time robustness. It captures how soon or late a signal meets the specification [11] and is therefore suitable for optimizing both accuracy and speed in achieving a task. A drawback is that Avg robustness does not support nested temporal operators.
II-A6 NEW Robustness
This metric was recently introduced in [7] and is based on scale-invariant behaviour. It is computed by taking a weighted average of the effective measures, where the weights are defined such that the metric becomes traditional space robustness as the weighting parameter approaches infinity. The authors' test results show better performance than AGM robustness in terms of finding feasible solutions.
II-B Metric Properties
[7] summarizes a number of properties that help to classify robustness metrics. Since most of these properties are especially relevant for optimization, Table I summarizes them for all considered robustness definitions, using the following property definitions. A metric is sound if its value is positive if and only if the signal satisfies the specification at time $t$. A metric is weakly smooth if it is continuous everywhere and its gradient is continuous for all points. A metric is said to converge if it is guaranteed to reach a local maximum with gradient-based optimization. A metric is shadow-lifting if it increases when partial progress is made towards the task specification. A metric is scale-invariant if scaling all of its inputs by any non-negative factor scales its value by the same factor.

Robustness Metrics
Properties  Space  LSE  Smooth  AGM  Avg  NEW
Weakly smooth  No  Yes  Yes  Yes  No  Yes
Sound  Yes  No  Yes  Yes  Yes  Yes
Converge  No  Yes  Yes  No  No  Yes
Shadow-lifting  No  No  No  Yes  Yes  Yes
Scale-invariant  Yes  Yes  Yes  No  Yes  Yes
III Learning from STL
In this section, we discuss the approach of using STL for optimizing tasks and present STL specifications to obtain a desired behavior. Specifically, we focus on optimizing the task execution duration and the final position of the robot end-effector using CMA-ES and Bayesian optimization (BO). At the end of each task execution, the optimizer receives a reward based on the STL specifications and generates new parameters for the next run, eventually producing a learned trajectory that fulfills the STL constraints.
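The evaluate-then-propose loop described above can be sketched as follows. This is only an illustration: a simple elitist evolution strategy is swapped in as a stand-in for CMA-ES or BO, and the names `optimize` and `toy_reward`, the goal point, and all numeric settings are hypothetical.

```python
import random

def optimize(reward_fn, x0, sigma=0.3, population=8, iterations=30, seed=0):
    """Elitist evolution strategy (a stand-in for CMA-ES/BO, not either of them)."""
    rng = random.Random(seed)
    best_x, best_r = list(x0), reward_fn(x0)
    for _ in range(iterations):
        for _ in range(population):
            # Propose new task parameters by perturbing the best ones found so far.
            cand = [xi + rng.gauss(0.0, sigma) for xi in best_x]
            # One task execution, scored by an STL robustness metric.
            r = reward_fn(cand)
            if r > best_r:
                best_x, best_r = cand, r
    return best_x, best_r

def toy_reward(params):
    """Hypothetical reward: positive iff the 2D point is within 0.1 of (1.0, 0.5)."""
    goal = (1.0, 0.5)
    return 0.1 - max(abs(p - g) for p, g in zip(params, goal))

best_params, best_reward = optimize(toy_reward, [0.0, 0.0])
print(best_params, best_reward)
```

In the setting of this paper, `reward_fn` would run the robot task with the given parameters and return the robustness of the recorded trajectory, so the optimizer only ever sees a scalar per evaluation.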
Let us consider the following desired behavior: “The end-effector has to eventually visit three regions within the time intervals to , to , and to seconds, respectively”. Equation (2) represents this as an STL specification.
(2)  
These specifications show the potential of STL for defining temporal constraints for task optimization. In general, the earlier the behavior violates the constraints, the lower the reward. The rewards obtained using such specifications can push the optimizer to satisfy all the conditions even when the constraints apply at different time instants. The specification is also simple to change in order to achieve different desired outcomes.
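A specification of this kind, a conjunction of "eventually visit region i within a time window" subformulas, can be evaluated with space robustness as sketched below. The circular regions, the time windows, the sampling period `dt`, and the function names are all hypothetical values chosen for illustration; they are not the ones used in (2).

```python
import math

def in_region(point, center, radius):
    """Predicate robustness: positive iff the point lies inside the circular region."""
    return radius - math.dist(point, center)

def eventually(trace, dt, t_start, t_end, center, radius):
    """Space robustness of F_[t_start, t_end](in region): max over samples in the window.
    Assumes the window contains at least one sample of the trace."""
    window = [p for i, p in enumerate(trace) if t_start <= i * dt <= t_end]
    return max(in_region(p, center, radius) for p in window)

def visit_all(trace, dt, goals):
    """Conjunction of the 'eventually' subformulas: min over their robustness values."""
    return min(eventually(trace, dt, a, b, c, r) for (a, b, c, r) in goals)

# Hypothetical goals: (window start, window end, region center, region radius).
goals = [
    (0.0, 1.0, (1.0, 0.0), 0.5),
    (2.0, 3.0, (2.0, 1.0), 0.5),
    (3.0, 4.0, (2.0, 2.0), 0.5),
]
# End-effector positions sampled every dt = 1.0 s.
good_trace = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
print(visit_all(good_trace, 1.0, goals))  # 0.5
```

Because the conjunction takes the minimum, the reward is determined by the worst-satisfied region, which is what pushes an optimizer to satisfy every window rather than only the easiest one.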
IV Experiments and Results
We conduct experiments on the 7-DOF Panda robot in a simulated environment. The number of evaluations per experiment is , each one taking minutes approximately. Reward computation and optimization are fast. The robot motion profile consists of three point-to-point trajectories. The objective is to optimize the duration parameters and end-effector position parameters of each trajectory, which totals 9 parameters. The robot resets to its initial position at the end of each evaluation.
The trace of the end-effector in Fig. 1 shows the trajectory positions converging to the goal regions. Fig. 2 displays the obtained rewards, which increase as the trajectories get closer to the desired regions. Fig. 3 shows the BO surrogate model, i.e. a Gaussian Process (GP), after evaluating iterations. It is clear from Fig. 4 that the optimization is heavily affected by the robustness metric used. Table II shows the performance of all the metrics using the BO and CMA-ES optimizers. The Success Rate (SR) is computed as the number of times the robot satisfies all the constraints over epochs. The Task Satisfaction (TS) is the first evaluation step at which all the STL constraints are satisfied. Some experiments are run until convergence. Smooth robustness performs better with BO, while NEW robustness performs better with CMA-ES.
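The two measures can be computed from the sequence of rewards collected during one experiment, as in the following sketch. It assumes a sound metric, so that a positive reward means all constraints are satisfied, and it normalizes SR by the number of evaluations; the function names are hypothetical.

```python
def success_rate(rewards):
    """Fraction of evaluations with positive reward (all constraints satisfied)."""
    return sum(r > 0 for r in rewards) / len(rewards)

def task_satisfaction(rewards):
    """1-based index of the first evaluation with positive reward, or None if none."""
    for step, r in enumerate(rewards, start=1):
        if r > 0:
            return step
    return None

rewards = [-0.4, -0.1, 0.2, -0.05, 0.3]
print(success_rate(rewards), task_satisfaction(rewards))  # 0.4 3
```

A return value of `None` from `task_satisfaction` corresponds to an experiment in which the constraints were never satisfied.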
Based on observations from the experiments, optimization with the space robustness metric does not always satisfy the constraints when there are more temporal operators. The performance of LSE and Smooth robustness is influenced by their respective scaling factors, which have to be tuned for different problems. Convergence with the AGM metric is not guaranteed unless the signals are normalised, but this is not straightforward for manipulators because of the complexity of obtaining workspace boundaries. The NEW robustness metric is suitable for defining manipulator tasks, as it consistently finds a solution, while Smooth robustness can converge faster provided the scaling factors are tuned.
Robustness Metrics

Type  Space  LSE  Smooth  Avg  AGM  NEW
BO
SR  12.71%  10.0%  29.9%  9.92%  20.0%  27.41%
TS  75  68  33  85  39  51
CMA-ES
SR  1.67%  5.0%  3.51%  0.0%  3.7%  6.67%
TS  58  48  32  Fail  50  29
V Conclusion
In this paper, we exploited STL-based constraints as cost functions to optimize simple robotic manipulation tasks. We analyzed several STL-based cost functions and showed their influence on optimizing simple robot trajectories in multiple-target reaching tasks. The experiments indicate that Smooth and NEW robustness are suitable for use with classical black-box optimizers such as BO and CMA-ES. This work can be extended by considering orientation parameters, nested temporal STL specifications, and other optimizers.
References
 [1] Lindemann, L et al., (2020). Barrier function based collaborative control of multiple robots under signal temporal logic tasks. IEEE TCNS, 7(4), 1916–1928.
 [2] Gundana, D., and Kress-Gazit, H. (2021). Event-based signal temporal logic synthesis for single and multi-robot tasks. IEEE RAL, 6(2), 3687–3694.
 [3] Puranic et al., (2021). Learning from Demonstrations Using Signal Temporal Logic in Stochastic and Continuous Domains. IEEE RAL.
 [4] Brochu et al., (2010). A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. ArXiv Preprint:1012.2599.
 [5] Pant, Y et al., Smooth operator: Control using the smooth robustness of temporal logic. 2017 IEEE CCTA. pp. 1235–1240 (2017)
 [6] Donzé, A. & Maler, O. Robust satisfaction of temporal logic over real-valued signals. ICFMATS. pp. 92–106 (2010)
 [7] Varnai, P. & Dimarogonas, D. On robustness metrics for learning STL tasks. 2020 ACC. pp. 5394–5399 (2020)
 [8] Belta, C. & Sadraddini, S. Formal methods for control synthesis: An optimization perspective. Annual Review Of CRAS. 2 pp. 115–140 (2019)
 [9] Gilpin et al., A smooth robustness measure of signal temporal logic for symbolic control. IEEE CSL. 5, 241–246 (2020)
 [10] Mehdipour et al., Arithmetic-geometric mean robustness for control from signal temporal logic specifications. 2019 ACC. pp. 1690–1695 (2019)
 [11] Aksaray et al., Q-learning for robust satisfaction of signal temporal logic specifications. 2016 IEEE CDC. pp. 6565–6570 (2016)