Testing Robot System Safety by creating Hazardous Human Worker Behavior in Simulation

by   Tom P. Huck, et al.

We introduce a novel simulation-based approach to identify hazards that result from unexpected worker behavior in human-robot collaboration. Simulation-based safety testing must take into account the fact that human behavior is variable and that human error can occur. When only the expected worker behavior is simulated, critical hazards can remain undiscovered. On the other hand, simulating all possible worker behaviors is computationally infeasible. This raises the problem of how to find interesting (i.e., potentially hazardous) worker behaviors given a limited number of simulation runs. We frame this as a search problem in the space of possible worker behaviors. Because this search space can get quite complex, we introduce the following measures: (1) Search space restriction based on workflow-constraints, (2) prioritization of behaviors based on how far they deviate from the nominal behavior, and (3) the use of a risk metric to guide the search towards high-risk behaviors which are more likely to expose hazards. We demonstrate the approach in a collaborative workflow scenario that involves a human worker, a robot arm, and a mobile robot.






I Introduction

Hazard analysis is a major challenge in human-robot collaboration (HRC). As HRC is increasingly deployed in industrial practice, effective methods are needed to ensure the safety of human workers. While traditional robot systems maintain safety through spatial separation, HRC requires more sophisticated measures, such as adaptive speed and workspace limitations, area monitoring (e.g., by laser scanners and light curtains), collision detection, and more [STD_ISOTS15066]. These measures must be configured appropriately with respect to the workflow and robot properties like stopping time and collision potential. Furthermore, the measures must ensure safety not only for expected worker behaviors, but also for unforeseen or erroneous behaviors. To ensure that these requirements are met, a hazard analysis is performed before commissioning [STD_ISO10218, STD_ISO12100].

Fig. 1: Worker behavior is modeled with action sequences and motion parameters. Hazardous behaviors are found by: (1) building a set of feasible action sequences (generation), (2) simulating sequences with nominal parameters to get an initial risk estimate (exploration), and (3) using optimization to find motion parameters that further increase risk (exploitation).

Current hazard analyses are mostly based on human reasoning, expert knowledge, and experience. In recent years, new approaches including formal verification[RA_Askarpour2016, RA_Vicentini2019] and rule-based expert systems [RA_Awad2017, RA_Wigand2020] have been proposed. While these approaches are viable, they typically operate on rather abstract models. Thus, their effectiveness is limited for systems that are complex or require fine-grained analysis. In such cases, simulation-based testing [RA_Araiza2016, RA_Huck2020] is useful, as it can represent certain effects (e.g. collisions) more accurately. However, simulation of HRC requires simulating the human worker, which is challenging. Given the variability of human behavior and the possibility of human error, it is insufficient to simulate only the expected worker behavior. Instead, effective simulation-based tests should also cover deviating behaviors to see if they create unsafe states. In complex systems, it may not be immediately obvious what types of behaviors are potentially hazardous. Furthermore, as worker behavior is complex and the number of simulation runs is limited by the computational budget, exhaustive simulation of all possible behaviors is infeasible. This leads to the search problem of finding hazardous worker behaviors given a limited number of simulation runs. We address this problem as follows:
1) Rationality checks: We restrict the search space by rejecting action sequences that violate workflow-constraints.
2) Prioritization: If the remaining sequences are still too many for exhaustive simulation, we prioritize sequences closer to the nominal behavior based on a distance metric.
3) Risk-guided search: We use a risk metric to guide the search towards high-risk behavior, which increases the chances of uncovering hazards.
Note that this paper is not concerned with the development of digital human models. Instead, we assume a human model as given and focus on the aforementioned search problem.
Fig. 1 shows the general outline of our approach, which is structured in three phases: In the generation phase, a preprocessing script builds a set of feasible sequences using workflow constraints that are modeled in a finite state machine (FSM). The sequences are fed into the 3D simulator CoppeliaSim (https://www.coppeliarobotics.com/) to search for the occurrence of hazardous states. The search itself is structured in an exploration phase, where initial risk estimates for the sequences are obtained, and an exploitation phase that attempts to find especially high-risk executions of the sequences (details in Sec. IV and V).

II Background and Related Work

II-A Robot Safety and Hazard Analysis in Industrial Practice

Safety requirements for industrial robot systems are specified by ISO/TS 15066 [STD_ISOTS15066] and ISO 10218 [STD_ISO10218] (the former is intended specifically for HRC). Depending on the robot's capabilities and mode of operation, safety can be achieved through different measures like protective stops, speed and separation monitoring, collision force limitation, or combinations thereof. To implement these measures, collaborative robots (e.g., the KUKA LBR iiwa or the Universal Robots UR10e) typically provide a set of safety functions, such as software-based workspace and speed limitations, which are programmed and parameterized by users or system integrators according to their specific needs. Additionally, external safety sensors (e.g., light curtains, laser scanners, camera systems) are often used to monitor the robot's vicinity for approaching humans [HRC_Wang2020]. Choice and configuration of safety measures are highly use-case dependent and influenced by various system-specific factors (e.g., cell layout, reachability of hazardous areas, robot stopping time/distance) [STD_ISO13855]. To ensure that all relevant hazards are properly addressed, HRC systems are subjected to a hazard analysis prior to commissioning [STD_ISO10218]. (Related terms such as "hazard and risk analysis" or "risk assessment" have slightly different meanings; for simplicity, we only use the term "hazard analysis".) This is a multi-step procedure to identify risks, estimate and evaluate potential hazards, and determine appropriate safety measures. Standards explicitly require that the hazard analysis considers not only the nominal workflow, but also erroneous worker behavior [STD_ISO12100]. In current practice, hazard analysis is largely based on human reasoning, expert knowledge, experience, and simple tools (e.g., checklists) [STD_ISO12100, STD_ISO14121].

II-B Novel Hazard Analysis Methods

Current hazard analysis methods are not well suited for complex HRC systems. Thus, a variety of novel methods have been proposed in the scientific literature. We briefly introduce the most prominent approaches; for a comprehensive review, we refer to [RA_Huck2021b].

II-B1 Semi-formal Methods

These methods use semi-formal (often graphical) system representations such as control structure diagrams [RA_Leveson2012], UML diagrams [RA_Guiochet2013], or task ontologies [RA_Marvel2014] to help the user analyze system behavior and identify potential hazards. Often, they also use a set of guide words to lead the user through the procedure [RA_Leveson2012, STD_IEC61882]. While these methods provide a certain level of support, their potential for automation is limited, since core aspects of the hazard analysis are still based on human reasoning.

II-B2 Formal Verification

Formal verification methods automatically check a system model in a formal language for possible violation of safety criteria, allowing for a certain extent of automation in the hazard analysis. In industrial HRC, this approach is mainly pursued by the SAFER-HRC-method, which is based on the formal language TRIO and the satisfiability checker ZOT [RA_Askarpour2016]. The method has also been extended to cover erroneous worker behavior [HUM_Askarpour2017] and can be used in conjunction with a 3D simulator [RA_Askarpour2020] for visualization. Besides industrial HRC, formal safety verification has also been applied to robot systems for personal assistance, medicine, and transport [RA_Webster2014, RA_Choi2021, RA_Proetzsch2007].

II-B3 Rule-based Expert Systems

Rule-based expert systems also provide a certain degree of automation. They use a domain-specific model, such as the Product Process Resource model (PPR) [MISC_Ferrer2015], to describe safety-related workspace characteristics (e.g. layout, which tools are used, and which tasks are performed). Pre-defined rule-sets map these workspace characteristics onto a set of potential hazards. Additional rule-sets can assist the user with adjacent tasks such as risk estimation or decisions about appropriate safety measures [RA_Awad2017, RA_Wigand2020].

II-C Simulation-based Safety Testing

The methods from Sec. II-B are based on relatively abstract system models (e.g., control structure diagrams, formal language descriptions, PPR models) that often require strong modeling simplifications; thus, their applicability is limited when it comes to detailed effects such as human motions or human-robot collisions. These effects can be explored better through simulation-based testing. But as simulation models are often complex and safety-critical states rare, creating interesting and critical test scenarios becomes a challenge [FA_Lee2020]. Often, algorithms for search, optimization, or machine learning are used to find simulation conditions that expose hazards or failures. Different heuristics (e.g., risk metrics or coverage criteria) are used to guide the algorithms [FA_Alexander2015, FA_Corso2019RewardAugmentation, FA_Klischat2019]. So far, simulation-based safety testing has mainly been applied to autonomous vehicles [FA_Babikian2020, FA_Ding2020, FA_Karunakaran2020, FA_Groza2017] and aerospace systems [FA_Lee2015, FA_Alexander2006].
In robotics, simulation-based testing has been used mainly for verification and validation on the level of individual system components (e.g., controller or code validation) [RA_Araiza2016, FA_Uriagereka2019, FA_Ghosh2018]. Recently, however, the idea of using simulation-based testing as a tool for hazard analysis has gained interest. In our earlier work, we proposed the use of Monte Carlo Tree Search (MCTS) to find hazardous states in HRC simulation [RA_Huck2020, RA_Huck2021]. A similar concept was explored in [RA_Andersson2021], but with deep reinforcement learning instead of MCTS. Other recent publications also consider simulation-based testing as part of their hazard analysis strategies [RA_Camilli2021, RA_Lesage2021]. Despite this recent interest, the potential of simulation-based testing for hazard analysis in HRC is still relatively unexplored, especially compared to its widespread use in other domains such as autonomous vehicles.

III Goal and Problem Definition

Fig. 2: A cycle of a simple example workflow: The worker transitions from the robot to the storage station (1), picks up a part (2), waits for the AGV to pass (3), transitions back (4), puts down the part in front of the robot (5), and waits for the AGV to return (6).

III-A Hazard Analysis as a Search Problem

This paper focuses on simulation-based safety testing for HRC systems and, more specifically, on the simulation of worker behavior. Our goal is to expose potential hazards by subjecting simulation models of HRC systems to critical worker behaviors that induce unsafe states. We frame this problem as a search problem with the goal of finding hazardous behaviors in the search space of possible worker behaviors. We assume that all non-human entities (e.g., robots, sensors) react deterministically to a given worker behavior. (This assumption may seem very restrictive at first, but is justified since industrial robots are usually pre-programmed and robust to external influences. For significant non-deterministic effects, e.g., varying payloads and stopping distances, a conservative worst-case assumption can be made.) Assuming a fixed initial state, the behavior of the worker can thus be considered the only variable on which the resulting simulation states depend. Under this assumption, we can express the search problem as a 4-tuple:

    ⟨M, s_0, B, S_u⟩    (1)

where M is the simulation model, s_0 the initial simulation state, B the set of possible worker behaviors, and S_u a (user-defined) set of unsafe states. The goal is now to find hazardous behaviors b ∈ B for which a simulation starting in s_0 results in an unsafe state s ∈ S_u:

    M(s_0, b) ∈ S_u    (2)
III-B Modeling of Worker Behavior

Human behavior can be modeled on different levels, from abstract action sequences to concrete motions [MISC_Bobick1997]. In case of a human worker, unsafe states may result from deviations on a workflow level (e.g., leaving out worksteps or switching their order), on a motion level (e.g., walking unexpectedly fast or slow), or from combinations of both. We therefore model worker behavior as a tuple consisting of an action sequence a and a parameter vector θ:

    b = (a, θ)    (3)

    a = (a_1, …, a_n) ∈ A^n,    θ = (θ_1, …, θ_m) ∈ ℝ^m    (4)

where the action sequence a denotes the order of worksteps and the parameter vector θ defines how these worksteps are executed on the motion level (i.e., aspects like walking speed, goals of reaching motions, etc.). Further, A denotes the worker's action space (i.e., the set of all possible worksteps), n the length of the action sequence, and m the number of parameters. We also assume that there is a known nominal behavior b_nom = (a_nom, θ_nom), consisting of a nominal action sequence a_nom and nominal parameters θ_nom. The nominal behavior represents the behavior that is intended and expected in a normal, non-erroneous execution of the workflow. With this worker model, our search problem has a mixed discrete-continuous search space of possible behaviors B:

    B = A^n × ℝ^m    (5)
III-C Example

To make the previous definitions more concrete, consider the example in Fig. 2. A worker, a stationary robot, and an automated guided vehicle (AGV) collaborate in a cyclic workflow. The worker's task is to deliver unmachined workpieces to the robot, which processes them and places them on the AGV that carries them away. This means that one cycle is as follows: Initially, the worker stands in front of the robot as a workpiece is processed. After processing, the robot loads the workpiece on the AGV, which takes it away. Meanwhile, the worker transitions to another station (Fig. 2-1) and picks up a new workpiece (Fig. 2-2). The worker waits until the AGV has cleared the area (Fig. 2-3), transitions back to the robot (Fig. 2-4) and puts down the new workpiece in front of the robot (Fig. 2-5). The robot senses the worker's arrival via a sensor mat (orange area in Fig. 2), moves towards the workpiece and starts the processing routine. Meanwhile, the AGV returns to its initial position. Action space and nominal action sequence are:

    A = {a_T, a_P, a_W, a_D}    (6)

    a_nom = (a_T, a_P, a_W, a_T, a_D, a_W)    (7)

with a_T, a_P, a_W, and a_D denoting the actions transition, pick up part, wait, and put down part, respectively. Possible parameters include the worker's walking speed v_W and the workpiece placement coordinates (x_wp, y_wp) on the table (depending on the desired level of detail, other parameters would be possible as well, e.g., waypoints to describe the worker's walking path; here we only consider these three parameters for the sake of simplicity):

    θ = (v_W, x_wp, y_wp) ∈ ℝ^3    (8)

Note that there are |A|^n = 4^6 = 4096 potential action sequences. For each sequence, there is a continuous parameter space in ℝ^3. Even with a relatively coarse discretization of each parameter into five steps, the search space would still consist of 4096 · 5^3 = 512,000 possible behaviors.
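These search-space counts can be checked directly; a quick sketch, assuming the example's four actions, a sequence length of six, and three motion parameters discretized into five steps each:

```python
# Size of the behavior search space in the example workflow.
n_actions = 4   # |A|: transition, pick up, wait, put down
seq_len = 6     # n: actions per workflow cycle

n_sequences = n_actions ** seq_len   # all potential action sequences
print(n_sequences)                   # 4096

# Discretizing each of the 3 motion parameters into 5 steps:
n_behaviors = n_sequences * 5 ** 3
print(n_behaviors)                   # 512000
```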

IV Proposed Solutions

As the example shows, the search space (i.e., the space of possible worker behaviors) can be vast, even in simple models. Exhaustively simulating all possible behaviors is infeasible. Instead, we propose the following three measures:

IV-A Rationality Checks based on Workflow Constraints:

It is usually not the case that any action can occur at any point in a workflow. Instead, there are certain constraints (e.g., a worker cannot put down a workpiece before picking it up). Using such constraints, we can reduce the search space by excluding infeasible sequences a priori (i.e., before running any simulations). We use a finite state machine (FSM) to define which actions are possible in which state of the workflow. The set of feasible action sequences A_feas is the set of all action sequences that are accepted by the FSM (here, "accepted" means that it must be possible to run the full action sequence in the FSM without ending up in a state where the required action is impossible).
For an example, consider Fig. 3, which shows the FSM corresponding to the example workflow from Sec. III-C. The nominal sequence, for instance, is accepted, whereas a sequence whose second action is the put-down action is infeasible, because that second action is impossible: the worker cannot put down the workpiece, as it has not been picked up previously.

Fig. 3: The finite state machine determines the feasible action sequences. Note: The first index of a state refers to the worker's position (R: at robot station, S: at storage station) and the second index to the workpiece position (R: at robot station, S: at storage station, H: in worker's hands).
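A minimal sketch of this rationality check for the example workflow. The state encoding (worker position, workpiece position) follows Fig. 3; the one-letter action names and the initial state are assumptions made for illustration, and the transition rules are a simplified reading of the workflow, not the paper's exact FSM:

```python
from itertools import product

def step(state, action):
    """One FSM transition; returns the next state, or None if infeasible.

    state = (worker position, workpiece position) with R: robot station,
    S: storage station, H: in worker's hands (workpiece only).
    Actions: T (transition), P (pick up), W (wait), D (put down).
    """
    worker, part = state
    if action == "T":    # walk to the other station
        return ("S" if worker == "R" else "R", part)
    if action == "W":    # waiting is always possible
        return state
    if action == "P":    # pick up: part must be where the worker is
        return (worker, "H") if part == worker else None
    if action == "D":    # put down: part must be in the worker's hands
        return (worker, worker) if part == "H" else None
    return None

def is_feasible(seq, start=("R", "S")):
    """Acceptance check: run the whole sequence without a dead end."""
    state = start
    for a in seq:
        state = step(state, a)
        if state is None:
            return False
    return True

# Generation phase: enumerate all feasible sequences of length 6.
feasible = [s for s in product("TPWD", repeat=6) if is_feasible(s)]
```

Note how the infeasible example from the text falls out automatically: a put-down before any pick-up is rejected because the workpiece is not in the worker's hands.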

IV-B Distance-based Prioritization of Action Sequences:

The second measure is to prioritize sequences that are closer to the nominal sequence a_nom. In sequential workflows, erroneous behavior typically results from certain basic errors being superposed on a_nom [HUM_Hollnagel1993]. Thus, the majority of erroneous sequences are likely to occur in a certain vicinity around a_nom, while highly erratic sequences can be regarded as unlikely. If the computational budget is not sufficient to simulate all feasible sequences, it is therefore reasonable to concentrate on exploring the vicinity of a_nom. To quantify how far a given sequence a is from a_nom, we define a distance metric. Based on [HUM_Hollnagel1993], we assume four basic error types, namely (i) insertion of an action into the sequence, (ii) omission of an action, (iii) substitution of an action by another, and (iv) reversal of the order of two adjacent actions. ([HUM_Hollnagel1993] lists more than four error types, but the additional types are special cases of these four.) Since we assume a fixed action sequence length n, sequences subjected to an insertion are truncated to length n by dropping the last action, and sequences subjected to an omission are extended to length n by inserting an arbitrary action at the end. We define the distance between a and a_nom as the minimum number of basic errors required to convert a_nom into a. We call this the error distance d_E.
For an example, consider the nominal sequence a_nom from Eq. (7): reversing its last two actions yields a sequence with error distance d_E = 1 (one reversal), whereas substituting three of its actions yields a sequence with d_E = 3 (three substitutions).
Note that the definition of d_E is analogous to the Damerau-Levenshtein distance commonly used to measure the dissimilarity between character strings [MISC_Damerau1964]. Thus, we can use an existing algorithm (http://www.ccpa.puc-rio.br/software/stringdistance/) to efficiently calculate d_E.
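The error distance can be computed with the standard dynamic program for the optimal string alignment variant of the Damerau-Levenshtein distance (substitution, insertion, deletion, and adjacent transposition); a self-contained sketch:

```python
def error_distance(a, b):
    """Optimal string alignment distance between sequences a and b.

    Counts the minimum number of insertions, omissions, substitutions,
    and reversals of adjacent elements needed to turn a into b.
    """
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i                      # delete all of a[:i]
    for j in range(m + 1):
        d[0][j] = j                      # insert all of b[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # omission
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
            # reversal of two adjacent actions
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[n][m]
```

Since the workflow keeps the sequence length fixed, insertions and omissions occur in truncated/padded pairs, but the general form above covers all four basic error types.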

Fig. 4: Example of the risk level over simulation time. The high peak corresponds to a collision, the low peaks to protective stops.

IV-C Risk-guided Search:

Thirdly, we use a risk metric to guide the search towards high-risk behaviors. To account for the mixed discrete-continuous search space (compare Eq. (5)), we use a two-stage search algorithm where the top level samples action sequences guided by the risk metric and the bottom level performs parameter optimization to find high-risk executions of a given action sequence (more in Sec. V). Since common risk metrics from safety standards (e.g., [STD_ISO12100, STD_ISO14121]) are typically based on human judgement, they are difficult to calculate automatically. Thus, we define a simplified heuristic metric R to quantify the risk:

    R = 0,             if v ≤ v_th
    R = r(d),          if v > v_th and d > 0    (9)
    R = F/F_lim + 1,   if v > v_th and d = 0

Here, d denotes the human-robot distance and v the robot speed (for articulated robot arms, v is the Cartesian speed of the fastest joint; for AGVs, v is the robot's translational ground speed). v_th is a user-defined speed threshold above which the robot is considered potentially dangerous (a typical choice is 250 mm/s, see [STD_ISO10218]), r(d) ∈ (0, 1] is a monotonically decreasing function of the distance d, F denotes the estimated human-robot collision force, and F_lim is a body-region-specific collision force limit [STD_ISOTS15066].
The metric differentiates between three types of situations:

  (a) The robot is slower than v_th and thus poses a negligible danger to the worker; the risk is zero.

  (b) The robot is faster than v_th, but the distance d is greater than zero. Thus, there is no contact (yet), but the robot poses a potential danger to the worker. In this case, the risk is defined on the basis of d.

  (c) The robot is faster than v_th and in contact with the worker (i.e., d = 0). In this case, the risk is defined as the ratio of the collision force F to the limit F_lim. The addition of +1 ensures that case (c) always returns a higher risk than case (b).

For a given behavior b, the risk R(b) is obtained by simulating b, evaluating R in each simulation step (see Fig. 4), and returning the highest value that has occurred:

    R(b) = max_t R(s(t))    (10)

It is assumed that exactly one human worker and one or multiple robots are present. For multiple robots, R is evaluated separately with respect to each robot and the maximum value is taken.
Since safety constraints in HRC are typically defined in terms of distance, velocity, and contact forces [STD_ISOTS15066], the definition of R allows us to express the set of unsafe states S_u via a risk threshold R_th:

    S_u = { s | R(s) > R_th }    (11)

For instance, with R_th = 1, all contact situations at critical robot velocity are deemed unsafe, whereas with R_th = 2, only collisions exceeding the force limit F_lim are deemed unsafe.
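The three-case risk metric above can be sketched as code. The distance-based term is written here as exp(-d) purely for illustration (any bounded, monotonically decreasing function of d would serve), and the threshold values v_th and F_lim are assumed placeholders:

```python
import math

def risk(d, v, F, v_th=0.25, F_lim=140.0):
    """Heuristic step-wise risk metric for one simulation step.

    d: human-robot distance [m], v: robot speed [m/s],
    F: estimated collision force [N] (meaningful only when d == 0).
    v_th and F_lim are assumed placeholder values.
    """
    if v <= v_th:               # case (a): robot slow, negligible danger
        return 0.0
    if d > 0:                   # case (b): fast but no contact yet
        return math.exp(-d)     # example choice of r(d) in (0, 1]
    return F / F_lim + 1.0      # case (c): contact; the +1 dominates case (b)

def behavior_risk(trace):
    """Risk of a behavior: maximum step-wise risk over a simulated trace
    of (distance, speed, force) samples."""
    return max(risk(d, v, F) for d, v, F in trace)
```

With these placeholder thresholds, any contact at critical speed scores above 1, so a threshold of 1 flags all such contacts and a threshold of 2 flags only collisions beyond the force limit, as in the text.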

V Search Method

V-A Assumptions, Objective, and General Approach

This section describes a concrete implementation of our approach. We assume that the simulation model M, the initial state s_0, and the action space A are given and that the lengths of action sequences and parameter vectors are fixed to n and m, respectively. We also assume that the nominal behavior (a_nom, θ_nom), parameter limits [θ_min, θ_max], and risk threshold R_th are given. The objective of the implemented search method is to find as many hazardous sequences as possible. According to Eq. (2), a sequence a is hazardous if parameters θ exist so that the behavior b = (a, θ) results in an unsafe state (i.e., a state where R_th is exceeded). As seen in Fig. 1, the general procedure is structured in three phases: In the generation phase, a preprocessing script builds a set of feasible action sequences. These are then fed into the 3D simulator CoppeliaSim. In the exploration phase, CoppeliaSim simulates the sequences with their nominal parameters to obtain an initial risk estimate. In the exploitation phase, CoppeliaSim deploys a parameter optimizer to find parameters that further increase the risk of the previously explored sequences. We assume that the computational budget is limited to a maximum number of simulation runs n_max, which is split between the exploration and exploitation phases by a split factor λ ∈ [0, 1], with λ·n_max simulation runs for exploration and the remaining runs for exploitation.

V-B Generation Phase

The preprocessing script generates the set of feasible action sequences A_feas using the FSM introduced in Sec. IV-A. The script iterates through the Cartesian product A^n and checks for each sequence whether it is accepted by the FSM. In Algorithm 1, this is represented by ll. 1-7 (the function isFeasible in l. 3 represents the acceptance check).

V-C Exploration Phase

In the exploration phase (ll. 8-17), the budget (i.e., the available number of simulation runs) is allocated according to the split factor λ. If the number of sequences in A_feas is smaller than λ·n_max, the remaining budget is allocated to the exploitation phase. Then, the simulator calculates the distance to a_nom for the sequences in A_feas and orders the sequences from low to high distance. (For brevity, both the distance calculation and the sorting are represented by a single function in Algorithm 1, l. 14.) The sequences in A_feas are simulated with nominal parameters θ_nom until the exploration budget is exhausted, yielding an initial risk estimate for each simulated sequence. Due to the ordering, sequences closer to a_nom are prioritized if the budget is insufficient to simulate all sequences in A_feas.

V-D Exploitation Phase

When the exploration budget is exhausted, the simulator transitions into the exploitation phase (ll. 18-31) and deploys a parameter optimizer to find parameters that further increase the risk of the explored sequences. Since the overall budget is limited, usually not all sequences can be optimized. Instead, sequences are prioritized according to their initial risk estimates, since high-risk sequences have a better chance of exceeding R_th. For the prioritization, two alternatives are implemented:
1) Strict priority: Sequences are ordered strictly by decreasing risk estimate (Algorithm 1, l. 18).
2) Probabilistic priority: Sequences are sampled with probabilities proportional to their risk estimates R̂ (this variant is not shown in Algorithm 1):

    P(a_j) = R̂(a_j) / Σ_k R̂(a_k)    (12)

Note that the latter also includes sequences with a low R̂, which would potentially be left out by strict prioritization.
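The probabilistic priority amounts to risk-proportional sampling; a sketch (the sequence labels and risk estimates below are made up for illustration):

```python
import random

def sample_by_risk(sequences, risk_estimates, rng):
    """Draw one sequence with probability proportional to its risk estimate."""
    total = sum(risk_estimates)
    r = rng.random() * total
    acc = 0.0
    for seq, w in zip(sequences, risk_estimates):
        acc += w
        if r <= acc:
            return seq
    return sequences[-1]   # guard against floating-point round-off

# Example: a high-risk sequence should be drawn far more often.
rng = random.Random(0)
seqs, risks = ["a1", "a2", "a3"], [0.1, 0.1, 0.8]
counts = {s: 0 for s in seqs}
for _ in range(10_000):
    counts[sample_by_risk(seqs, risks, rng)] += 1
```

Unlike strict ordering, every sequence with a nonzero estimate retains some chance of being optimized, which is exactly the property the text attributes to this variant.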
Sequences whose initial risk estimate already exceeds R_th are removed from A_feas (these are already identified as hazardous, so further simulations would be futile).
For optimization we use a Nelder-Mead optimizer (NMO). NMO is chosen because it is a direct search method that does not require gradient information; this is necessary because the risk value is obtained from simulation, not from an analytical function, so gradients are not available. Since NMO is well known, we omit a detailed description and refer to [MISC_Nelder1965]. The optimization loop is shown in ll. 24-28. NMO.init() randomly initializes the NMO algorithm and returns an initial parameter vector θ. Then, the sequence under consideration is simulated with these parameters, and the resulting risk is given to NMO.iterate(), which performs one step of the NMO algorithm and returns a new θ. This is repeated until R_th is exceeded or a maximum number of steps is reached, in which case the algorithm moves on to the next sequence. Penalty functions ensure that parameters remain within [θ_min, θ_max]. (For brevity, this aspect is not shown in Algorithm 1.) The exploitation phase terminates when the budget is exhausted.

1:  A_feas ← ∅
2:  for each a ∈ A^n do  # select feasible sequences
3:      if isFeasible(a) then
4:          A_feas ← A_feas ∪ {a}  # keep accepted sequence
5:          R̂(a) ← 0  # initialize risk estimate
6:      end if
7:  end for
8:  if |A_feas| < λ·n_max then  # alloc. exploration budget
9:      n_expl ← |A_feas|
10: else
11:     n_expl ← λ·n_max
12: end if
13: n_expt ← n_max − n_expl  # remaining budget
14: A_feas ← sortByDistance(A_feas, a_nom)  # ascending error distance d_E
15: for j = 1 … n_expl do  # exploration
16:     R̂(a_j) ← simulate(a_j, θ_nom)
17: end for
18: A_feas ← sortByRisk(A_feas)  # strict priority: decreasing R̂
19: A_feas ← A_feas \ {a | R̂(a) > R_th}  # remove sequences that already exceed R_th
20: i ← 0, j ← 1  # i: sim runs, j: sequences in A_feas
21: while i < n_expt and j ≤ |A_feas| do
22:     k ← 0  # optimization step counter
23:     θ ← NMO.init()
24:     do
25:         R ← simulate(a_j, θ)
26:         θ ← NMO.iterate(R)
27:         k ← k + 1
28:     while R ≤ R_th and k < k_max
29:     j ← j + 1  # go to next sequence
30:     i ← i + k  # k sim runs conducted in inner loop
31: end while
Algorithm 1 Search Method Pseudocode
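The NMO.init()/NMO.iterate() loop can be mimicked with any standard Nelder-Mead implementation; a compact, self-contained sketch that maximizes a simulated risk by minimizing its negative (the toy "risk" function and its peak location are invented for illustration, standing in for a CoppeliaSim run):

```python
def nelder_mead(f, x0, step=0.1, max_iter=500, tol=1e-9):
    """Minimize f with the Nelder-Mead simplex method (no gradients needed)."""
    n = len(x0)
    # initial simplex: x0 plus one perturbed vertex per dimension
    simplex = [list(x0)] + [[x0[j] + (step if j == i else 0.0)
                             for j in range(n)] for i in range(n)]
    scores = [f(x) for x in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda i: scores[i])
        simplex = [simplex[i] for i in order]
        scores = [scores[i] for i in order]
        if scores[-1] - scores[0] < tol:
            break
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        worst = simplex[-1]
        reflected = [centroid[j] + (centroid[j] - worst[j]) for j in range(n)]
        fr = f(reflected)
        if fr < scores[0]:                           # try expansion
            expanded = [centroid[j] + 2.0 * (centroid[j] - worst[j])
                        for j in range(n)]
            fe = f(expanded)
            simplex[-1], scores[-1] = (expanded, fe) if fe < fr else (reflected, fr)
        elif fr < scores[-2]:                        # accept reflection
            simplex[-1], scores[-1] = reflected, fr
        else:                                        # contract towards centroid
            contracted = [centroid[j] + 0.5 * (worst[j] - centroid[j])
                          for j in range(n)]
            fc = f(contracted)
            if fc < scores[-1]:
                simplex[-1], scores[-1] = contracted, fc
            else:                                    # shrink towards best vertex
                for i in range(1, n + 1):
                    simplex[i] = [simplex[0][j] + 0.5 * (simplex[i][j] - simplex[0][j])
                                  for j in range(n)]
                    scores[i] = f(simplex[i])
    i_best = min(range(n + 1), key=lambda i: scores[i])
    return simplex[i_best], scores[i_best]

# Toy stand-in for the simulator: risk peaks at walking speed 1.8 and
# placement offset 0.4 (values are arbitrary).
def neg_risk(theta):
    v, x = theta
    return (v - 1.8) ** 2 + (x - 0.4) ** 2

best, fbest = nelder_mead(neg_risk, [1.0, 0.0])
```

In the actual method each evaluation of neg_risk would be one simulation run, which is why the step counter in Algorithm 1 doubles as the budget counter.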

VI Application Example

VI-A Scenario

We present an example using the workflow from Sec. III-C (a second example with a slightly more complex workflow is available on GitHub (https://github.com/Huck-KIT/Robot-Hazard-Analysis-Simulation), but is not discussed here for reasons of brevity). Action space and parameters are chosen according to Eq. (6) and Eq. (8), respectively. The nominal sequence is chosen according to Eq. (7). The workflow constraints are given by the FSM shown in Fig. 3 (see Sec. IV-A for details). Our safety criterion is that there must be no collisions where the contact force exceeds a given fraction of the collision force limit F_lim; the risk threshold R_th is set accordingly, and the set of unsafe states S_u follows from Eq. (11).

For testing purposes, deliberate safety flaws are designed into the system, so that both robot and AGV can cause an unsafe state if the worker behaves in certain ways that deviate from the nominal behavior (see Fig. 5). The generation phase yields 266 feasible action sequences from the initially 4096 potential combinations.

VI-B Test Runs and Results

The 266 remaining feasible sequences are searched for hazardous behavior in several different test runs.
1) Comparison of search configurations: Test runs with a constant split factor λ were performed to compare different search configurations: search with strict prioritization (SP), search with probabilistic prioritization (PP), and, as a baseline, a random search (R) where sequences and parameters are sampled from uniform distributions over A_feas and [θ_min, θ_max], without any form of distance or risk prioritization. For each approach, ten test runs were conducted for each of two budgets n_max. The results are reported in Table I. For the larger budget, SP outperforms R in terms of the number of identified hazards, while the error distances of the identified hazards are similar. For the smaller budget, SP slightly underperforms R, but the identified hazards are closer to the nominal behavior than those found by R. Interestingly, PP underperforms both R and SP in both cases, as it tends to waste simulation time on low-risk sequences that have little chance of exposing any hazard.
2) Split factor sweep: Additionally, tests with a fixed search configuration but different split factors λ were conducted. For these tests, n_max was chosen equal to the number of feasible sequences, so that an exploration-only search corresponds to λ = 1. Results are shown in Fig. 6. The best results were obtained with medium split factors, which is expected, as a small λ leads to insufficient exploration of the search space, whereas a large λ leaves too few simulation runs for optimization. Yet, one should keep in mind that the influence of λ is problem-dependent and might look very different in another scenario.

Fig. 5: Left: The worker’s hand collides with the robot after placing the workpiece on the table. Right: Worker and AGV collide.
Fig. 6: Split factor sweep (search configuration: SP). For each λ, three test runs were conducted, except for λ = 1, where the search is fully deterministic (exploration only).
          Larger budget n_max              Smaller budget n_max
Run    R: N/d̄   SP: N/d̄   PP: N/d̄      R: N/d̄   SP: N/d̄   PP: N/d̄
 1     27/3.1    36/3.2    24/3.1       20/3.2    19/2.6    12/2.9
 2     27/3.2    38/3.2    21/3.1       17/3.2    19/2.8    12/2.8
 3     24/2.5    37/3.3    22/2.6        9/3.2    15/2.9    12/2.6
 4     27/3.1    32/3.2    26/3.3       20/2.6    19/2.8    10/2.8
 5     24/2.8    37/3.2    25/2.8       24/2.6    15/2.4    13/2.8
 6     25/3.3    34/3.2    27/2.8       16/3.1    17/3.0    12/2.7
 7     29/3.0    32/3.2    23/2.7       12/3.4    18/2.2    10/2.8
 8     28/3.0    33/3.2    23/3.0       15/2.9    14/2.5    16/2.6
 9     27/3.1    31/3.2    23/3.2       18/2.8    16/2.6    10/2.8
10     29/3.4    30/3.2    23/2.9       21/2.8    17/2.5    12/2.5
Avg.   26.7/3.1  34.0/3.2  23.7/3.0     17.2/3.0  16.9/2.6  11.9/2.7
TABLE I: Comparison of search configurations for two computational budgets n_max. R: random search, SP: search with strict prioritization, PP: search with probabilistic prioritization. N: number of hazardous behaviors found, d̄: average error distance of the hazardous behaviors found. In all cases, a constant split factor λ was used.

VII Discussion and Future Work

The presented method enables users to consider the effects of deviating worker behavior in their hazard analyses. The proposed simulation-based approach has several benefits: Simulation models are often already available in the development phase, so additional system modeling is largely avoided (except for modeling the FSM that describes the workflow constraints). Also, compared to PPR models or formal language descriptions, simulation captures physical effects (e.g., movements or collisions) more accurately. Finally, the simulation is treated similarly to a black-box model: Apart from the action space, parameter ranges, and workflow constraints, no explicit knowledge about the system's internal workings is required (e.g., which safety measures are used or how they are configured). This information is contained implicitly in the simulation, but there is no need to formalize it explicitly (as would be the case, e.g., in rule-based methods). This improves scalability to complex systems.
The main challenge is the large search space of possible worker behaviors. Our test example indicates that the proposed search techniques can cope with this effectively, although more extensive tests in different scenarios are needed to confirm the promising initial results. Note that our approach is limited in the sense that it can only falsify (i.e., find cases of safety constraints being violated) but cannot provide a safety guarantee. As with all simulation-based techniques, success depends on model quality. With faulty or overly simplistic models, the simulation might find hazards that do not exist in reality, or, more concerning, real-world hazards might not manifest themselves in simulation. For instance, our example uses a relatively simplistic worker model where certain potentially hazardous behaviors (e.g., deviations in the walking path) are not captured. Future work will therefore focus on the integration of more sophisticated digital human models and on the use of advanced search algorithms. It would also be interesting to benchmark against an exhaustive search rather than a random search as a baseline; in the present paper this was not yet done due to the extensive computation time required.
Finally, we emphasize that the proposed method is an addition to existing ones and not a replacement. In the long term, simulation-based testing could be used in conjunction with formal verification to provide more detailed analysis of cases where formal verification indicates potential hazards.


We thank Tamim Asfour for his support and constructive discussions.