On the Generalizability of Motion Models for Road Users in Heterogeneous Shared Traffic Spaces

by   Fatema T. Johora, et al.

Modeling mixed-traffic motion and interactions is crucial to assess the safety, efficiency, and feasibility of future urban areas. The lack of traffic regulations, diverse transport modes, and the dynamic nature of mixed-traffic zones like shared spaces make realistic modeling of such environments challenging. This paper focuses on the generalizability of the motion model, i.e., its ability to generate realistic behavior in different environmental settings, an aspect which is lacking in existing works. Specifically, our first contribution is a novel and systematic process for formulating general motion models; we apply this process to extend our Game-Theoretic Social Force Model (GSFM) towards a general model for generating a large variety of motion behaviors of pedestrians and cars from different shared spaces. Our second contribution is to consider different motion patterns of pedestrians by calibrating motion-related features of individual pedestrians and clustering them into groups. We analyze two clustering approaches. The calibration and evaluation of our model are performed on three different shared space data sets. The results indicate that our model can realistically simulate a wide range of motion behaviors and interaction scenarios, and that adding different motion patterns of pedestrians into our model improves its performance.



I Introduction

Shared space design principles [clarke2006shared] have been drawing significant attention in recent years, as an alternative to traditional regulated traffic designs. In shared spaces, heterogeneous road users such as pedestrians, cars, bicycles share the same space. Unlike traditional traffic environments, in shared spaces, there are no or very few road signs, signals, and markings; this causes frequent direct interactions among road users to coordinate their trajectories.

There is an ongoing debate on the safety of shared spaces; while some studies state that the lack of explicit traffic regulations makes road users more safety-conscious and may lead to fewer road accidents [hamilton2008shared, clarke2006shared, kaparias2012analysing], others [clayden2006improving, jenks1983residential] argue that the lack of acceptance and understanding of the concept can compromise safety in shared spaces. Notwithstanding this debate, traditional road designs have been replaced by shared spaces in a growing number of urban areas; some examples are the Laweiplein intersection in Drachten, Skvallertorget in Norrköping, and Kensington High Street in London [hamilton2008shared].

Yet, the lack of explicit rules makes it essential to investigate the safety issues in shared spaces. Modeling and simulating shared spaces by analyzing and reproducing the motion behaviors of road users, including their interactions, is crucial to assess and optimize such spaces during the planning phase. Realistic simulation models can also form a safe basis for autonomous cars to learn how to interact with other road users.

Interpreting and modeling mixed-traffic interactions pose challenging problems; an interaction can be a simple reaction or a result of complex human decision-making processes, i.e., modifying speed or direction by predicting other road users’ behavior, or communicating with them [rasouli2019autonomous]. Moreover, how one interacts with others is dependent on many factors like their transport mode, current situation, road structures and conditions, social norms (culture), and many individual factors (e.g. age, gender, or time pressure [kaparias2012analysing]).

To the best of our knowledge, few works so far address the modeling and simulation of shared spaces. We observe two main state-of-the-art approaches: (1) physics-based models, mainly the social force model (SFM) of pedestrian dynamics [helbing1995social] including numerous extensions adding, e.g., new forces, decision-theoretic concepts, or rule-based constraints, to describe different types of actors such as cars [schonauer2017microscopic, anvari2015modelling] or bicycles [rinke2017multi]; and (2) cellular automata (CA) models [lan2005inhomogeneous, zhang2007modeling, bandini2017collision], which are mainly used for modeling mixed-traffic flows in settings with explicit traffic regulations, unlike most shared spaces.

Although these approaches perform well for single bilateral conflicts (i.e., for any point in time, a road user can only handle a single explicit conflict with one other user), they fail in representing multiple conflicts among heterogeneous road users and groups, which are very common in shared spaces. Hence, in our previous works, we integrated SFM with a game-theoretic model to address both bilateral and multiple conflicts among pedestrians and cars [johora2018modeling, ahmed2019investigating]. In this paper, we describe conflict as “an observable situation in which two or more road users approach each other in time and space to such an extent that there is a risk of collision if their movements remain unchanged” as specified in [gettman2003surrogate]; here, we use the terms conflict and interaction interchangeably.

In the literature, motion models do not adequately consider the differences in road users' behaviors induced by differing environmental settings. These models are usually calibrated and validated using scenarios from a single shared space environment. In [johora2020zone], we took a first step to address this gap by proposing the concept of zone-specific motion behaviors for pedestrians and cars, considering road and intersection zones. In [johoraTL2020], we evaluated the transferability of our existing model (we use the terms transferability and generalizability interchangeably) by modeling scenarios that differ from the one used in [johora2020zone] in terms of traffic conditions, spatial layout and social norms. The results show that our model can suitably replicate the motion of pedestrians and cars from the new scenarios.

In this paper, we go further in this direction by proposing a conceptually simple and systematic process for formulating general motion models and, by following this process, deriving a moderate version of a general motion model for pedestrians and cars. A general model should be able to reproduce a large variety of motion behaviors of heterogeneous road users, ranging from simple free-flow motion to motion resulting from complex interactions, and should be transferable to new environments with minimal time and effort. The current work differs from our previous work [johoraTL2020] in terms of model transferability in two ways: (1) In this paper, we build a general model that captures motion behaviors from three data sets, with incremental integration of new motion behaviors and a well-defined and largely automated calibration process to adapt model parameters to the target environment. In [johoraTL2020], by contrast, we did not have a specific process for generating a general motion model; to adapt to a new environment, we had to analyze and explicitly change our model parameters and methods based on the social norms of that environment, which resulted in a different version of our model for each environment. (2) In the current work, the transferability of our model is evaluated using the DUT and HBS data sets, as in [johoraTL2020], and also a new data set (CITR) that contains conflict scenarios not found in the other two data sets (see Section IV).

We further introduce heterogeneity in pedestrian motion by recognizing different motion patterns: we calibrate individual motion characteristics (e.g., sensitivity when interacting with others) and cluster them into different groups (see Section VI). In this paper, the keyword group represents a set of pedestrians with a similar motion pattern, not a social group such as family members. The contributions of this paper are:

  • We propose a systematic process to formulate a general motion model.

  • We propose a motion model for pedestrians and cars, which can simulate a large variety of conflict scenarios among road users and evaluate the generalizability of our model by using three different shared space data sets. The results of our evaluation process indicate that our model achieves satisfactory performance for each data set.

  • We present a methodology to recognize and model different motion patterns of pedestrians from real-world data sets. To do so, we investigate several approaches to cluster pedestrians with similar motion patterns into groups. Our evaluation results show that the heterogeneity in pedestrians' motion improves the model performance.

Following a review of previous research in Section II, we propose the formulation of a general model for movement modeling of heterogeneous road users in Section III. We illustrate the examined data sets and the architecture of our Game-Theoretic Social Force Model (GSFM) in Section IV and Section V, respectively. Section VI explains the calibration methodology and recognition of different walking styles of pedestrians. In Section VII, we describe how we evaluate model performance and discuss the results. We conclude by outlining avenues for future research.

II Related Works

Existing mixed-traffic motion models are mostly built based on rule-based models (e.g. Cellular Automata (CA) [zhang2007modeling]), or physics-based models, most preeminently the Social Force Model (SFM) [helbing1995social].

CA models describe road users' motion behavior by a set of state-transition rules in a discrete environment. They have been used to model the motion of homogeneous road users, e.g., pedestrians [burstedde2001simulation, bandini2017approach] or cars [nagel1992cellular, chai2015fuzzy], and there are also a few works describing mixed-traffic motion, e.g., [zhang2007modeling], who study interactions among pedestrians and cars at crosswalks, [lan2005inhomogeneous], who model car-following and lane-changing actions of cars and motorcycles, or [chen2018evaluating], who study bicycle-to-vehicle interactions and their impact on traffic delay.

In the classical SFM, introduced in [helbing1995social], the movement of a pedestrian is represented by differential equations comprising a set of simple attractive and repulsive forces from other pedestrians and static obstacles that the pedestrian experiences at a specific place and time. Even though SFM was initially designed for pedestrian dynamics [chen2018social, asano2010microscopic, johora2017dynamic], many studies have extended it to model other types of road users. For example, [yang2018social, zeng2014modified] include vehicles, considering their impact on pedestrians as separate forces; in [anvari2015modelling], Anvari et al. add new forces and rule-based constraints to handle short-range and long-range conflicts among pedestrians and cars. In [rinke2017multi], SFM is combined with long-range collision avoidance mechanisms to model motion behaviors of pedestrians, vehicles and bicycles.

Both CA-based and SFM-based models can represent simple situations well. However, game-theoretic or probabilistic models are more suitable for complex scenarios where road users must choose an action among many alternatives to handle a given situation [helbing1995social]. In [pascucci2017discrete], in case of complex interactions, road users' choice of action is modeled by a logit model, based on available data but without considering what other users might do. In [fujii2017agent], Fujii et al. use a discrete choice model to represent decision making in pedestrian interactions. Game-theoretic models have often been applied to interpret human decision-making processes, including in traffic situations. Some examples are the application of non-cooperative games to model merging and give-way interactions among vehicles [kita1999merging], pedestrian-to-car interaction in shared spaces [schonauer2017microscopic], and bicyclist-to-car interaction at zebra crossings [bjornskau2017zebra], or to analyze how cyclist and pedestrian interaction differs between human-driven and autonomous vehicles [michieli2018game]. In [lutteken2016using], lane-changing behaviors of cars are modeled using a cooperative game in which cars cooperate with each other for a collective reward. In a non-cooperative game, by contrast, each player makes decisions by predicting others' decisions, which is very similar to what real-world road users often do [bjornskau2017zebra].

Although there are several works on modeling the motion behavior of road users, only very few studies consider different motion patterns for individual road user types [kabtoul2020towards, alahi2017learning, yu2014multiagent]. Kabtoul et al. [kabtoul2020towards] manually annotate several predefined pedestrian types based on willingness to give way to a car. Alahi et al. [alahi2017learning] obtain different movement styles for pedestrians by learning collision avoidance parameters of individual pedestrians and clustering them into groups using k-means clustering. Their model is restricted to pedestrian-only scenarios. In another study, the authors classified pedestrians into groups based on their age range and gender and assigned individual speed profiles to each group. These speed profiles are collected from the literature instead of real-world data sets.

Existing closed-source commercial simulators (e.g., AIMSUN [casas2010traffic] or VISSIM [fellendorf2010microscopic]) and open-source simulators (SUMO [behrisch2011sumo]) are somewhat capable of modeling and simulating mixed traffic at a microscopic level. However, open-source simulators like SUMO have limited means for modeling interactions between heterogeneous road users. To address this issue, some studies combined SUMO with agent frameworks such as JADE [soares2013agent] or JASON [Fiosins+2016arts]; however, adding new environmental features or defining new modalities in such models is difficult. Also, SUMO lacks flexibility regarding lane and vehicle geometries, which is restrictive for shared spaces.

III Modeling Process

A general motion model should be able to reproduce realistic motion behaviors of road users in different environmental settings (in terms of road structures, culture or norms, types of road users, and types of interactions) and to adapt to new environments with little time and effort, which makes building such models very challenging.

We propose a systematic process to construct a general motion model in Figure 1. Here, D, A, and M represent the decision, action and merge nodes, respectively. The process starts with modeling the free-flow movements of road users (A1), with their type, origin, destination, and speed profiles as input. The next step is to analyze and model interactions among road users. To do so, one can collect and explore a real-world traffic data set (A2), identify and extract conflict scenarios between two or more road users (A3), recognize and classify the interactions among the road users (A4), and then model these interactions (A5). Finally, the model needs to be calibrated (A6) and evaluated (A7), both quantitatively (minimizing the difference between real and generated trajectories) and qualitatively (reproducing realistic behaviors), using the extracted conflict scenarios. However, generating a general motion model is a continuous process which requires testing the model with new data sets, i.e., new environments, and also adding new modalities. As shown in Figure 1, to evaluate the model performance on a new data set (D1), it is necessary to check (D2) whether it contains any new kinds of interaction; if so, these interactions need to be integrated (A5) into the model. Next, all parameters (including the new ones) must be calibrated and the model evaluated on each data set. To add a new user type (M1), e.g., integrating vehicles into a pedestrian-only motion model, one needs to go through all the steps in Figure 1. This iterative modeling process continues until a stopping criterion, such as a certain level of accuracy in realistic trajectory modeling, has been reached. The stopping criterion is application dependent.

Fig. 1: Formulation of a general motion model for mixed-traffic environments.

In this paper, we use this process to derive a moderate version of a general model for generating realistic trajectories of pedestrians and cars in different shared spaces, using the HBS, DUT and CITR data sets. Our approach to recognizing and classifying interactions (A4), modeling these interactions (A5), and calibrating (A6) and evaluating (A7) the model is discussed in Sections IV-B, V, VI and VII, respectively.

IV Data Sets and Interaction Classifications

IV-A Data Sets

Fig. 2: The spatial layout of three shared space environments; the top-left sub-figure visualizes the shared street from HBS, the top-right sub-plot shows the roundabout from DUT, the bottom-left sub-plot depicts the intersection from DUT and the bottom-right sub-figure shows interactions from CITR.

We have been developing a motion model of pedestrians and cars, named Game-Theoretic Social Force Model (GSFM) [johora2018modeling, johora2020zone, Johora2020agent], mainly based on the scenarios manually extracted from a street-like shared space environment in Hamburg, Germany (HBS). In this paper, to move towards a general model, we evaluate our model on two other data sets which are different from the HBS data set in terms of spatial structures, types of interactions, and the number of road users. These data sets are the DUT data set from Dalian University of Technology campus in China and the CITR data set from the Ohio State University campus in the USA. All three data sets are visualized in Figure 2, and their details are given below:

  • HBS [rinke2017multi]: The HBS data set was collected from a street with pedestrians crossing from both sides. It contains both bilateral and multilateral interactions among pedestrians and cars, with or without car-following interactions. We extracted 103 such scenarios from HBS.

  • DUT [yang2019top]: The DUT data set contains trajectories of pedestrians and cars from a roundabout and an intersection. It comprises car-to-crowd lateral interactions; most scenarios extracted from DUT have a large number of pedestrians compared to the HBS and CITR scenarios.

  • CITR [yang2019top]: CITR is an experimentally designed data set, collected from a university parking lot. It contains several lateral, front, and back interactions among pedestrians and cars.

Here, as shown in the bottom-right sub-figure of Figure 2, a lateral interaction is a situation where pedestrian(s) cross in front of or behind the car. A front interaction is a face-to-face interaction, and in a back interaction scenario, the car drives behind the pedestrian(s). There are also observable differences among these data sets which can be interpreted as cultural differences. For example, in the DUT data set, road users maintain a smaller inter-distance (i.e., safety distance) than in the HBS and CITR data sets (see Section VI). In all three data sets, an agent's position at each time step (i.e., 0.5 s) is given as a 2D vector in the pixel coordinate system, and the data sets also contain the pixel-to-meter conversion scales. Table I summarizes the number of scenarios and individuals involved.

Data set | # of Scenarios | # of Pedestrians | # of Cars | Time step
HBS      | 103            | 206              | 126       | 0.5 s
CITR     | 26             | 208              | 26        | 0.5 s
DUT      | 30             | 607              | 39        | 0.5 s
TABLE I: Statistics of the data sets
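As a minimal sketch of how such pixel trajectories could be prepared for calibration, the snippet below scales pixel positions to meters and derives per-step speeds using the 0.5 s sampling interval. The function names, the example track, and the scale of 0.01 m per pixel are hypothetical and not taken from the data sets.

```python
from math import hypot

TIME_STEP_S = 0.5  # sampling interval reported for HBS, CITR and DUT


def pixels_to_meters(track_px, meters_per_pixel):
    """Scale a list of (x, y) pixel positions to meters."""
    return [(x * meters_per_pixel, y * meters_per_pixel) for x, y in track_px]


def speeds(track_m, dt=TIME_STEP_S):
    """Finite-difference speed (m/s) between consecutive positions."""
    return [
        hypot(x1 - x0, y1 - y0) / dt
        for (x0, y0), (x1, y1) in zip(track_m, track_m[1:])
    ]


# Hypothetical example: both the track and the scale are made up.
track_px = [(0, 0), (30, 40), (60, 80)]
track_m = pixels_to_meters(track_px, meters_per_pixel=0.01)
step_speeds = speeds(track_m)
```

In this made-up example each step covers 0.5 m in 0.5 s, so both entries of `step_speeds` are approximately 1 m/s.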

IV-B Interaction Classification

In our previous works [johora2018modeling, Johora2020agent], we classified road users' interactions broadly into two categories based on Helbing's classification of road agents' behavior [helbing1995social] and the observation of the shared space video data (mostly HBS): simple interaction (percept → act) and complex interaction (percept → choose an action from different alternatives → act). These interactions can also be sub-categorized based on the number and types of road users involved: simple interaction contains car-following, pedestrian-to-pedestrian, and pedestrian(s)-to-car reactive interactions, and complex interaction includes pedestrian(s)-to-cars, pedestrians-to-car and car-to-car interactions. We note that complex car-to-car interaction is not included in this paper.

As mentioned earlier, in this paper, we are still focusing on pedestrians and cars, but we aim to evaluate the performance of our model on the DUT and CITR data sets. According to the process proposed in Figure 1, we analyze these two data sets and detect the following new types of interactions:

  • Unlike in HBS, cars in the DUT data set sometimes deviate somewhat from their trajectory as a result of reactive interaction with pedestrians, mostly because the environment structure in DUT leaves more free space for the motion of cars.

  • As already discussed in Section IV-A, the CITR data set [yang2019top] contains front and back interactions among pedestrians and cars, which are not observed in the HBS or DUT data sets [rinke2017multi].

How we model these interactions, including integration of new interaction types, is described in Section V.

V Agent-Based Simulation Model

Fig. 3: Conceptual model of pedestrians' and cars' motion behaviors. Here, AF denotes the added force to classical SFM and A/D signifies activation/deactivation of a module.

We pursue an agent-based model, GSFM, to represent the motion behaviors of pedestrians and cars, initially described in [johora2018modeling]. Here, we give an overview of the architecture of GSFM, visualized in Figure 3. In GSFM, each road user is modeled as an individual agent, and their movements are handled by three interacting modules, namely trajectory planning, force-based modeling, and game-theoretic decision-making. Each of these modules has its own role. GSFM is implemented on a BDI (Belief, Desire, Intention) platform, LightJason [aschermann2016lightjason], which permits flexible design and explanation of the control flow of GSFM through its three modules. Based on the current situation, the BDI controller activates the relevant module, which then informs the controller on completion of its task.

The trajectory planning module computes the free-flow trajectories for all agents by considering only the static obstacles (e.g., boundaries or trees) in the environment. To plan an individual agent's trajectory, we transform the simulation environment into a visibility graph [koefoed2012representations], add the agent's origin and destination positions to the graph and run the A* algorithm [millington2009artificial].
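The planning step can be illustrated with a small A* search over a graph whose nodes are already mutually visible waypoints. This is only a sketch under stated assumptions: the visibility-graph construction itself is omitted, the node names and coordinates are hypothetical, and edge costs are taken to be Euclidean distances.

```python
import heapq
from math import dist


def a_star(nodes, edges, start, goal):
    """Shortest path on a graph with Euclidean edge costs.

    nodes: dict name -> (x, y); edges: dict name -> iterable of neighbor names.
    Returns the list of node names from start to goal, or None if unreachable.
    """
    open_heap = [(dist(nodes[start], nodes[goal]), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        if cost > best_cost[current]:
            continue  # stale heap entry, a cheaper route was found later
        for nxt in edges.get(current, ()):
            new_cost = cost + dist(nodes[current], nodes[nxt])
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = current
                estimate = new_cost + dist(nodes[nxt], nodes[goal])
                heapq.heappush(open_heap, (estimate, new_cost, nxt))
    return None


# Hypothetical layout: an obstacle sits between origin and destination,
# and its two corners are the only intermediate waypoints.
nodes = {"origin": (0, 0), "corner_a": (2, 3), "corner_b": (2, -1),
         "destination": (4, 0)}
edges = {"origin": ["corner_a", "corner_b"],
         "corner_a": ["destination"], "corner_b": ["destination"]}
route = a_star(nodes, edges, "origin", "destination")
```

The straight-line distance to the goal never overestimates the remaining path length, so the search returns a shortest route; here it passes the nearer corner, `corner_b`.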

The force-based module governs the actual execution of an agent's physical movement and also captures the simple interactions between agents by using and extending the SFM. We use the classical SFM to model the driving force of an agent i towards its destination, the repulsive force from static obstacles, and the repulsive force from other agents j. The driving force is defined by the desired and current velocities of i and a relaxation time. Each repulsive force is defined by an interaction strength, the range of the repulsive interaction, the distance from i to j or from i to the obstacle at a specific time, and a normalized direction vector. A further factor describes the fact that humans are mostly affected by objects within their field of view [johansson2008specification]; it depends on the strength of interactions from behind and on an angle. Additionally, we extend SFM to represent car-following interaction and pedestrian-to-car reactive interaction. In car following, the distance between car i and its leader car j is compared with the minimum vehicle distance; depending on the outcome, i either continues moving towards a target point or decelerates, and the normalized velocity of j also enters this rule. The reactive interaction emerges only if pedestrian(s) have already begun walking in front of the car. Then the car decelerates to let the pedestrian(s) proceed. This module also executes the decisions computed in the game module.
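For orientation, the classical SFM terms named above are often written as in the sketch below. The exponential decay of the repulsion and the form of the field-of-view factor follow commonly used SFM formulations and are assumptions here, not necessarily the exact expressions used in GSFM; all function and parameter names are hypothetical.

```python
from math import cos, exp, hypot, pi


def driving_force(desired_v, current_v, tau):
    """(v_desired - v_current) / tau: acceleration towards the desired velocity."""
    return ((desired_v[0] - current_v[0]) / tau,
            (desired_v[1] - current_v[1]) / tau)


def repulsive_force(pos_i, pos_j, strength, interaction_range):
    """Repulsion on i from j, decaying exponentially with their distance."""
    dx, dy = pos_i[0] - pos_j[0], pos_i[1] - pos_j[1]
    d = hypot(dx, dy)  # assumed to be non-zero (agents do not overlap)
    magnitude = strength * exp(-d / interaction_range)
    return (magnitude * dx / d, magnitude * dy / d)


def field_of_view_weight(lam, phi):
    """Anisotropy factor: 1 for a source straight ahead (phi = 0),
    lam for a source directly behind (phi = pi)."""
    return lam + (1.0 - lam) * (1.0 + cos(phi)) / 2.0


# Hypothetical values, chosen only to make the behavior visible.
push = driving_force(desired_v=(1.5, 0.0), current_v=(0.5, 0.0), tau=0.5)
away = repulsive_force((1.0, 0.0), (0.0, 0.0), strength=2.0, interaction_range=1.0)
behind = field_of_view_weight(lam=0.2, phi=pi)
```

With these values the driving force points along +x, the repulsion pushes agent i away from j along +x, and a source directly behind is weighted by `lam`.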

As discussed in Section IV-B, the CITR data set contains two new types of interaction, namely the front and back interactions between a pedestrian and a vehicle. We incorporate these two interactions into our model as a single type, i.e., longitudinal interaction, as follows:

When the activation condition holds, we add a temporary goal for the respective pedestrian. The quantities in this condition are defined in Eq. (1), which includes the dot product of the direction vectors of the two agents; the temporary goal is obtained by a rotation according to Eq. (2) [weisstein2003rotation], and the remaining quantities are calculated in Eq. (3) and Eq. (4). The pedestrian thus continues moving towards the temporary goal to avoid conflict.

In this paper, the associated distance parameter is set to 10 m. Deviation of cars due to reactive interaction with pedestrians in the DUT scenarios is addressed by the SFM repulsive force.

The game-theoretic module controls the complex interactions among agents, e.g., pedestrians-to-cars interaction, using a Stackelberg game, i.e., a sequential leader-follower game. In a Stackelberg game, the leader first decides on a strategy that maximizes its utility by predicting all possible reactions of the followers, and then the follower reacts by choosing its best response [schonauer2017microscopic]. The game is solved by finding the sub-game perfect Nash equilibrium (SPNE), i.e., the optimal strategy pair. Eq. (5) and Eq. (6) define the SPNE and the best response of the follower, respectively, in terms of the leader's and follower's strategies, the utilities of the corresponding strategies, and their strategy sets.
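The leader-follower reasoning described above can be sketched as backward induction over a finite payoff table. This is an illustrative sketch for one leader and one follower only; the payoff numbers are invented and are not the calibrated payoff matrices of GSFM.

```python
def follower_best_response(follower_payoff, leader_action, follower_actions):
    """Follower action with the highest follower payoff for a fixed leader action."""
    return max(follower_actions,
               key=lambda f: follower_payoff[(leader_action, f)])


def solve_stackelberg(leader_payoff, follower_payoff,
                      leader_actions, follower_actions):
    """Backward induction: the leader picks the action whose anticipated
    best response gives the leader the highest payoff."""
    best = None
    for action in leader_actions:
        reply = follower_best_response(follower_payoff, action, follower_actions)
        value = leader_payoff[(action, reply)]
        if best is None or value > best[2]:
            best = (action, reply, value)
    return best[0], best[1]


# Hypothetical ordinal payoffs: a car leads, a pedestrian follows.
car_actions = ["Continue", "Decelerate"]
ped_actions = ["Continue", "Decelerate", "Deviate"]
car_payoff = {("Continue", "Continue"): -10, ("Continue", "Decelerate"): 4,
              ("Continue", "Deviate"): 3, ("Decelerate", "Continue"): 1,
              ("Decelerate", "Decelerate"): -2, ("Decelerate", "Deviate"): 0}
ped_payoff = {("Continue", "Continue"): -10, ("Continue", "Decelerate"): 1,
              ("Continue", "Deviate"): 2, ("Decelerate", "Continue"): 4,
              ("Decelerate", "Decelerate"): -1, ("Decelerate", "Deviate"): 1}
equilibrium = solve_stackelberg(car_payoff, ped_payoff, car_actions, ped_actions)
```

With these made-up numbers, the pedestrian would deviate if the car continues and would continue if the car decelerates, so the car prefers to continue and the resulting strategy pair is (Continue, Deviate).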


An individual game manages each complex interaction, and the games are independent of each other. In each game, the number of leaders is fixed to one, but there can be more than one follower. We perform separate experiments with a randomly chosen leader, the faster agent (i.e., the car) as leader, and the pedestrian as leader. The results suggest that the faster agent as leader is the best choice. However, if the scenario includes more than one car (e.g., pedestrian-to-cars interaction), then the car that recognizes the conflict first is considered the leader. To calculate the payoff matrix of the game, as shown in Figure 6, first, all actions of the players are ordinally valued, assuming that they prefer to reach their destination safely and promptly. Then, to express situation dynamics, we select several features by analyzing real-world situations and perform a backward elimination process on the selected features to get the most relevant ones. Let i be an agent which interacts with another agent j; then the relevant features are the following:

  • NOAI: the number of active interactions of i as a car.

  • CarStopped: has a nonzero value if i (as a car) is already stopping to give way to another agent j', otherwise 0.

  • MinDist: has a value derived from distance(i, j) if that distance meets a threshold condition under which it is difficult for car i to stop, otherwise 0.

  • CompetitorSpeed: has a nonzero value if the current speed of j meets a threshold condition, otherwise 0.

  • OwnSpeed:

  • Angle:

During game playing, Continue, Decelerate and Deviate (only for pedestrians) are the viable actions for road users. These actions are executed in the force-based module.

  • Continue: A pedestrian i crosses a car j from a designated point if an intersection condition holds; otherwise she continues her free-flow motion. The point and the condition are defined using a direction vector, a scaling factor, and the current and final positions. Cars continue by following their free-flow motion.

  • Decelerate: Agents decelerate and eventually stop, if required. Pedestrians reduce their speed unless the car is very near (a condition on distance(i, j) that includes a 1 m margin), in which case the pedestrian stops; cars reduce their speed according to their own rule. The critical spatial distance is a parameter of these rules.

  • Deviate: A pedestrian i passes behind a car j via an intermediate position (for as long as j stays within the range of view of i) and afterwards proceeds towards her original goal position.

(a) Pedestrian-to-Car Interaction
(b) Impacts of Situation Dynamics
Fig. 6: The complete payoff matrices for pedestrian-to-car interactions.

Although these modules do not obey any sequence and take control alternately, GSFM keeps a hierarchy among them at the start of the simulation. It starts with trajectory planning, assuming that agents plan their trajectories before they begin moving. When trajectories are planned, the BDI controller activates the force-based module to model the physical movement of agents. Conflict recognition and classification are performed at regular intervals (the algorithm is given in [johora2020zone]), and if a complex conflict is detected, the controller activates the game-based module. As soon as the strategies are decided, the controller activates the force-based module again to execute the chosen strategies. The BDI controller also prioritizes different interactions based on their seriousness; for cars, for example, other interactions take priority over car following. The following code fragment depicts the basic elements of a BDI program consisting of beliefs (in pink), plans (in blue), and actions (in black).

Sample BDI Code Fragment:

+!main <-
    generic/print("Name", MyName);
    !!calculate/route; !walk.

+!calculate/route <-
    route/calculate.

+!walk: >>(module(S), generic/type/isnumeric(S) && S==3.0) <-
    calculate/next/position; !walk.

+!update/belief(G) <-
    >>module(S); -module(S); +module(G).

+!game/decelerate: >> (module(S), general/type/isnumeric(S) && S==1.0) <-
    stop/moving; !game/decelerate; !walk.

Here, ‘+’, ‘-’, and ‘>>’ signify add (plan or belief), remove (belief) and unification (belief), respectively. The double exclamation mark before the calculate/route plan indicates that this plan should run in the current time step, and the single exclamation mark before walk indicates that the plan will execute in the next time step. An agent can also trigger a plan from the environment. As an example, when the game module decides on the strategies for the road users involved in a conflict situation, it triggers the plan update/belief and the plan related to the decision, i.e., game/decelerate in this sample (incomplete) code fragment.


The process of modeling the movements of any agent i at any time step t in GSFM is summarized in Eq. (7)–(9), whose symbols denote the target agent i, the competitive agent j, the static obstacle, the model inputs, and the positions of i at the current and next time steps. The input profile contains the start, goal, and speed profile of i. The goal of i is estimated by extending its last observed position in the real trajectory using Eq. (10) with an extended length of 5 m. Two weights are scenario dependent: one is 1 for the CITR scenarios and 0 otherwise, and the other is 1 for the DUT scenarios and 0 otherwise.


We calculate the desired speed of a pedestrian by identifying the walking portion of his/her trajectory, i.e., where the pedestrian's speed is larger than a threshold, and then averaging all speed values in that portion. The threshold value is fixed in advance. A car's desired speed is derived from the set of all speed values of that car.
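The two input-preparation steps described above, estimating a pedestrian's desired speed and extending the last observed position into a goal, might be sketched as follows. The threshold of 0.3 m/s and the choice to extend along the final observed heading are assumptions made for illustration; only the 5 m extension length comes from the text.

```python
from math import hypot


def desired_walking_speed(speed_samples, walking_threshold):
    """Mean of the speed samples (m/s) that exceed the walking threshold."""
    walking = [v for v in speed_samples if v > walking_threshold]
    return sum(walking) / len(walking) if walking else None


def extend_goal(previous_pos, last_pos, extension_m=5.0):
    """Shift the last observed position extension_m meters along the final heading.

    Assumption: the heading is the direction from the previous to the last
    observed position.
    """
    dx, dy = last_pos[0] - previous_pos[0], last_pos[1] - previous_pos[1]
    norm = hypot(dx, dy)
    if norm == 0.0:
        return last_pos  # no heading available, keep the observed position
    return (last_pos[0] + extension_m * dx / norm,
            last_pos[1] + extension_m * dy / norm)


# Hypothetical samples: the pedestrian pauses twice while crossing.
v_desired = desired_walking_speed([0.1, 1.2, 1.4, 0.0, 1.0], walking_threshold=0.3)
goal = extend_goal(previous_pos=(0.0, 0.0), last_pos=(3.0, 4.0))
```

Here the two pauses are excluded from the average, and the goal lies 5 m beyond the last observation along the same direction.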

VI Calibration Methodology

In this paper, we calibrate our model parameters in several steps, as visualized in Figure 7, and the calibration is performed using a genetic algorithm (see Section VI-B). To recognize different motion patterns of pedestrians from real-world scenarios, we investigate two clustering approaches, namely Principal Component Analysis (PCA) with the k-means algorithm (step S3), and k-means with the step-wise forward selection (FS) method (steps S4 and S6); see Section VI-A. The steps in Figure 7 are as follows: We start by performing universal calibration to get one unique set of parameter values for all pedestrians, assuming that they all act similarly in the same situation.

Fig. 7: The workflow of model calibration.

In the next step, we calibrate the parameters individually for each pedestrian and then cluster the individual parameters using the above-mentioned clustering approaches, which gives us two different sets of pedestrian groups. Next, we perform group calibration (steps S5, S7, and S8) so that each group has a unique set of parameter values. For the groups (i.e., clusters) that are obtained in step S3, we perform group calibration directly. However, for the groups obtained by completing S4 and S6, we perform group calibration in two different phases, i.e., S7 and S8. In S7, we calibrate the parameters selected by the FS method individually for each group, while keeping the values of the remaining parameters (obtained in S1) the same for all groups. In S8, by contrast, we calibrate all parameters separately for each group. Each of these approaches generates a different version of the GSFM model (see Section VI-B).

GSFM contains a large set of parameters, which can be broadly classified into parameters for SFM interaction, safety measurements, and payoff matrix calculation for game playing. The SFM and safety-related parameters are listed in Table IV, and Table III shows the game parameters. Among these parameters, we select the sets of parameters given in Table II for grouping pedestrians, based on a sensitivity analysis. The rest of the parameters are calibrated universally in step S1.

Interaction strength (PP, PC, CP)
Repulsive interaction range (PP, PC)
Anisotropic parameter
Scaling factor for deviate action
TABLE II: The list of parameters calibrated for clustering

VI-A Clustering

K-means with Principal Component Analysis

K-means is a simple, fast, and widely used clustering algorithm that groups data based on the Euclidean distance between data points, with a predefined number of clusters [marutho2018determination]. In this paper, we decide on the number of clusters using the elbow method [marutho2018determination], and each data point represents the calibrated parameter values of an individual pedestrian.

Principal Component Analysis [marutho2018determination] is a technique that reduces a larger number of parameters to a smaller set of parameters that are linear combinations of the original parameters and contain most of their information. As stated in [ding2004k], reducing the dimension of data using PCA is beneficial for k-means. Thus, we use PCA to reduce the number of parameters given in Table II, and then perform k-means on the reduced parameter set to cluster pedestrians into groups.
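The PCA-then-k-means pipeline described above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the per-pedestrian parameter values are made up, the farthest-first centroid initialization and the power-iteration PCA are choices made here to keep the example deterministic and dependency-free, and all function names are hypothetical.

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100):
    """Lloyd's k-means; farthest-first initialization (an assumption for determinism)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances, the quantity plotted by the elbow method."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


def pca_project(points, n_components, iters=500):
    """Project centered data onto leading principal components (power iteration + deflation)."""
    n, d = len(points), len(points[0])
    means = [sum(col) / n for col in zip(*points)]
    centered = [[a - m for a, m in zip(p, means)] for p in points]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)] for i in range(d)]
    components = []
    for _comp in range(n_components):
        v = [1.0 + 0.1 * i for i in range(d)]  # fixed, slightly asymmetric start vector
        for _it in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        components.append(v)
        # Deflation: remove the component just found from the covariance matrix.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return [[sum(a * b for a, b in zip(r, comp)) for comp in components] for r in centered]


# Hypothetical calibrated parameters of six pedestrians (three parameters each).
pedestrians = [
    [1.0, 0.2, 6.0], [1.1, 0.2, 6.2], [0.9, 0.3, 5.9],
    [15.0, 0.9, 12.0], [15.2, 1.0, 11.8], [14.9, 0.8, 12.1],
]
reduced = pca_project(pedestrians, n_components=2)
inertias = {k: inertia(reduced, *kmeans(reduced, k)) for k in range(1, 5)}
labels, _ = kmeans(reduced, k=2)

if __name__ == "__main__":
    print("inertia per k (look for the elbow):", inertias)
    print("group of each pedestrian:", labels)
```

The inertia values are what one would inspect for the elbow: the number of clusters is chosen where adding another cluster no longer reduces the inertia substantially.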

K-means with Forward Selection

Forward selection is a simple but commonly used feature (or parameter) selection method. It starts with an empty model that contains no parameters, and then continues adding the most significant parameter, one after another, until a predefined stopping criterion is reached or all available parameters are already in the model [borboudakis2019forward].

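Input: number of clusters k, set of candidate parameters, predefined score
Output: set of selected parameters for clustering
best score = 0;   // initialize clustering score to 0
while best score < predefined score and candidate parameters remain do
    for each candidate parameter do
        if selected set == empty then
            perform k-means clustering for the candidate parameter;
        else
            trial set = selected set ∪ {candidate parameter};
            perform k-means clustering for the trial set;
        score = silhouette score of clusters;
        if score > best score then
            record the candidate parameter and its score as the best;
    selected set = selected set ∪ {best candidate parameter};
    candidate parameters = candidate parameters \ {best candidate parameter};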
Algorithm 1 Forward Selection with k-means
Fig. 8: Different pedestrians groups of the DUT data set with different motion patterns.

We calculate the significance of the parameter(s) by executing k-means for some k (i.e., number of clusters) and measuring the clustering performance using the silhouette score. The method terminates once a preset value of the silhouette score has been reached. The silhouette value measures how similar a data point is to its own cluster compared to other clusters [ROUSSEEUW198753]. Algorithm 1 shows the steps of the forward selection method with k-means. After performing feature selection using Algorithm 1, we perform k-means on the reduced set of parameters to cluster pedestrians into groups with different motion patterns.
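A minimal Python sketch of forward selection driven by k-means and the silhouette score is given below. It is one possible reading of the procedure, not the authors' code: the data and parameter names are invented, the k-means initialization is farthest-first for determinism, and the choice to carry the best score of each round forward as the stopping value is an assumption.

```python
from math import dist


def kmeans_labels(points, k, iters=100):
    """Lloyd's k-means with farthest-first initialization (an assumption for determinism)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean silhouette value over all points; singleton clusters contribute 0."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, p in enumerate(points):
        same = [dist(p, q) for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        if not same:
            continue
        a = sum(same) / len(same)  # mean distance to the point's own cluster
        b = min(                   # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def forward_selection(rows, names, k, target_score):
    """Greedily add the parameter that gives the best silhouette score."""
    selected, remaining, best_score = [], list(range(len(names))), 0.0
    while remaining and best_score < target_score:
        round_best, round_idx = float("-inf"), None
        for j in remaining:
            cols = selected + [j]
            sub = [[row[c] for c in cols] for row in rows]
            score = silhouette(sub, kmeans_labels(sub, k))
            if score > round_best:
                round_best, round_idx = score, j
        selected.append(round_idx)
        remaining.remove(round_idx)
        best_score = round_best
    return [names[j] for j in selected], best_score


# Hypothetical individually calibrated parameters of six pedestrians.
names = ["interaction_strength", "anisotropy", "deviate_scale"]
rows = list(zip(
    [0.10, 0.12, 0.11, 1.90, 1.92, 1.88],
    [0.2, 0.5, 0.8, 0.3, 0.6, 0.9],
    [7.0, 7.6, 6.5, 7.2, 6.8, 7.4],
))
chosen, score = forward_selection(rows, names, k=2, target_score=0.9)

if __name__ == "__main__":
    print(chosen, round(score, 3))
```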

Figure 8 shows different clusters of pedestrians from the DUT data set obtained by performing k-means with forward selection and k-means with PCA, from left to right. We conduct these approaches separately on each data set.

VI-B Calibration

Genetic algorithms (GA) [zames1981genetic] are evolutionary algorithms, largely applied to tackle optimization problems such as calibration of model parameters [amirjamshidi2019multi, schiermeyer2016genetic].

As stated earlier, we calibrate our model parameters using a GA. It begins by feeding a random initial set of chromosomes, i.e., sets of values for the parameters that need to be calibrated, into the simulation model. The simulation outputs are compared with real-world data to compute and assign a fitness score to the respective chromosome. Next, an offspring population is generated by performing the selection (of the fittest members), crossover, and mutation processes, and is fed into the model again. This loop repeats until a specific stopping criterion is reached.
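The loop described above can be sketched as follows. This is a generic, minimal genetic algorithm and not the configuration used in the paper: the toy `simulate` function stands in for the simulation model, the population size, mutation rate, elitism, and fixed generation count are all assumptions, and every name is hypothetical.

```python
import random
from math import dist

# Hypothetical "real" trajectory produced by a constant velocity of (1.3, 0.4).
REAL = [(1.3 * t, 0.4 * t) for t in range(10)]


def simulate(params):
    """Toy stand-in for the simulation model: constant-velocity motion."""
    vx, vy = params
    return [(vx * t, vy * t) for t in range(10)]


def position_error(params):
    """Fitness to minimize: mean distance between simulated and real positions."""
    return sum(dist(p, q) for p, q in zip(simulate(params), REAL)) / len(REAL)


def genetic_algorithm(fitness, bounds, pop_size=30, generations=40,
                      mutation_rate=0.2, seed=0):
    """Minimize `fitness` over real-valued chromosomes constrained to `bounds`."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []
    for _ in range(generations):  # stopping criterion: fixed number of generations
        population.sort(key=fitness)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]   # selection of the fittest half
        children = [list(population[0])]        # elitism: keep the current best
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, len(bounds)) if len(bounds) > 1 else 0
            child = mother[:cut] + father[cut:]  # one-point crossover
            for g, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:  # Gaussian mutation, clipped to bounds
                    child[g] = min(hi, max(lo, child[g] + rng.gauss(0, 0.1 * (hi - lo))))
            children.append(child)
        population = children
    best = min(population, key=fitness)
    return best, fitness(best), history


if __name__ == "__main__":
    best, error, _ = genetic_algorithm(position_error, bounds=[(0.0, 3.0), (0.0, 3.0)])
    print("calibrated parameters:", best, "fitness:", error)
```

Because the best chromosome is copied unchanged into each new generation, the best fitness in this sketch can never get worse from one generation to the next.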

We only consider the parameters in Table II for grouping pedestrians, and we calibrate these parameters as illustrated in Figure 7. We calibrate the rest of the GSFM parameters beforehand, separately and in two steps: first, we calibrate the remaining SFM and safety parameters, and then we calibrate the game parameters. We conduct all these calibration steps using the above-described genetic algorithm. Note that during the individual calibration of a pedestrian, we simulate only the target pedestrian and update the states of the surrounding agents according to their real trajectories.

The selection of the fitness function and the simulation output type depends on the types of parameters to calibrate. To calibrate the SFM and safety parameters, GSFM outputs the simulated positions of the agent(s), which are compared with their real positions to calculate the fitness score of the respective chromosome. For the universal and group calibration, the fitness score is calculated by Eq. (11), and the fitness function for the individual calibration is given in Eq. (12).


The quantities in these fitness functions include the number of scenarios, the number of agents, and the number of time steps. For Eq. (13), the simulated decisions are obtained by game playing and the real decisions are manually extracted from the video data. To calibrate the game parameters, calculating the fitness score using Eq. (13) is preferable, as the game module is responsible for deciding on the decisions/strategies of agents in any conflict situation, not their motion (see Section V). We use Eq. (13) for calibrating the game parameters for the HBS data set, but for the CITR and DUT data sets, Eq. (11) is used due to the difficulty of extracting the real decisions manually.

The values of the game parameters are given in Table III. Table IV lists the SFM and safety-related parameters with their calibrated values, where PP, PC, CP, and CC denote pedestrian-to-pedestrian, pedestrian-to-car, car-to-pedestrian, and car-to-car interactions, respectively.

Symbol HBS Value DUT Value CITR Value
11 4 10.4
1 0 1
11 0 6.3
3 0 0.3
2 0 1.1
1 6.6 0.4
7 5 6.1
7 8 7
5 8 5
8 6 8
TABLE III: List of game parameters with calibrated values
Symbol | Description | Unit | HBS (G1 / G2 / G3 / U) | DUT (G1 / G2 / U) | CITR (G1 / G2 / G3 / U)
() | Interaction strength | | 0.1 / 0.1 / 1.9 / 0.1 | 0.01 / 0.1 / 0.1 | 0.2 / 0.1 / 0.4 / 0.1
() | Interaction strength | | 15.1 / 17.3 / 11.9 / 11.7 | 1.6 / 3.4 / 4.5 | 0.2 / 2.6 / 0.07 / 1.5
() | Interaction strength | | 0.76 | 1.7 | 2.27
() | Repulsive interaction range | | 0.17 / 0.24 / 0.25 / 0.25 | 0.17 / 0.18 / 0.23 | 0.1 / 0.2 / 0.25 / 0.18
() | Repulsive interaction range | | 0.1 / 0.7 / 0.2 / 0.91 | 0.11 / 0.14 / 0.27 | 1.5 / 0.39 / 1.1 / 0.69
 | Anisotropic parameter | | 0.35 / 0.339 / 0.42 / 0.35 | 0.43 / 0.16 / 0.41 | 0.15 / 0.59 / 0.52 / 0.13
 | Scaling factor for deviate action | | 7.6 / 12 / 7.8 / 6 | 6 / 7 / 9.01 | 6.1 / 8.4 / 8.3 / 7.0
 | Range of view | | 18.4 | 10 | 12.3
() | Critical spatial distance | | 7.8 | 8 | 7
 | Scaling factor for accelerate action | | 6
 | Scaling factor for conflict detection | | 9
() | Critical spatial distance | | 8
 | Interaction strength for obstacle | | 10
(obstacle) | Repulsive interaction range | | 0.2
TABLE IV: The list of the SFM and safety parameters with their calibrated values. Here, G1, G2, G3 are the clustered groups.
Model | Pedestrian (HBS) | Pedestrian (DUT) | Pedestrian (CITR) | Vehicle (HBS) | Vehicle (DUT) | Vehicle (CITR)
GSFM-M1 | 0.745/0.807/0.338/0.0182 | 0.654/1.07/0.261/0.033 | 0.565/0.859/0.1742/0.0037 | 1.26/3.33/1.083 | 1.29/3.33/0.795 | 2.41/5.17/1.153
GSFM-M2 | 0.747/0.812/0.333/0.0112 | 0.643/1.06/0.263/0.036 | 0.546/0.813/0.1754/0.0037 | 1.33/3.46/1.107 | 1.22/3.04/0.787 | 2.46/5.19/1.166
GSFM-M3 | 0.766/0.854/0.338/0.0138 | 0.698/1.19/0.260/0.033 | 0.577/0.878/0.1742/0.0035 | 1.28/3.39/1.094 | 1.25/3.27/0.803 | 2.49/5.28/1.183
GSFM-U | 0.754/0.829/0.335/0.0127 | 0.705/1.22/0.265/0.030 | 0.577/0.880/0.1740/0.0035 | 1.30/3.42/1.097 | 1.41/3.51/0.842 | 2.49/5.29/1.180
SFM | 1.122/1.164/0.376/0.0305 | 1.499/2.26/0.263/0.036 | 1.185/1.791/0.2566/0.0123 | | |
TABLE V: Quantitative results i.e., aADE(m) / aFDE(m) / SD() / CI of the classical SFM and all versions of GSFM. Here, the bold number denotes the best score.

After performing the clustering and calibration processes, we obtained several sets of parameters, which result in different versions of our model. Specifically, GSFM-M1 is the model with k-means and PCA; GSFM-M2 is the model that combines the forward selection method with k-means and calibrates all parameters given in Table II during group calibration (S8); GSFM-M3 is the model with FS and k-means where only the parameters selected by FS are calibrated in group calibration (S7); and GSFM-U denotes the universal model, i.e., the model with one set of parameters. Due to space restrictions, Table IV lists only the values of the parameters in GSFM-U (denoted as U) and GSFM-M2 for each data set. Here, G1, G2, G3 denote the clusters or groups.

Fig. 9: Crowd-to-car interaction from the DUT data set. The first row shows the real trajectories and the second row depicts the simulated trajectories, at two subsequent time steps. Car’s trajectories are in black color.
Fig. 10: Pedestrians-to-cars crossing scenario from the HBS data set. The dotted lines represent the real trajectories and the solid lines are the simulated trajectories. Trajectories are visualized at two subsequent time steps.

VII Evaluation

As a quantitative evaluation, we compare all our models, namely GSFM-M1, GSFM-M2, GSFM-M3 and GSFM-U and the classical SFM proposed in [helbing2000simulating]. We calibrate all parameters of the classical SFM for each data set using the GA in Section VI-B and the fitness function in Eq. (11), for a fair comparison. The performances of these models are evaluated by the metrics given in Section VII-A on the extracted interaction scenarios from the HBS, DUT and CITR data sets (summarized in Table I). We select three example scenarios among all to evaluate the performance of our model qualitatively. We run all simulations on an Intel Core™i5 processor with 16 GB RAM.

VII-A Evaluation Metrics

To evaluate the performance of the proposed models in terms of how realistic the resulting trajectories are, we consider the two most commonly used metrics [rudenko2020human, alahi2017learning], namely average displacement error (ADE) and final displacement error (FDE), together with two other metrics:

  • Adjusted Average Displacement Error (aADE): ADE computes the pairwise mean square error (in meters) between the simulated and real trajectories of each agent over all positions and averages the error over all agents. In our extracted scenarios, the trajectory lengths of the agents differ; thus, we choose an adjusted version of ADE, which relates the error to a predefined trajectory length (i.e., number of time steps), assuming that the error in trajectory modeling increases linearly, to evaluate our models’ performance more precisely.

  • Adjusted Final Displacement Error (aFDE): FDE calculates the average displacement error (in meters) of the final point of all agents. We adjust FDE in the same way as aADE.

  • Speed Deviation (SD): The SD metric measures the pairwise difference between the simulated and real speeds of each agent over all time steps and averages these differences over all agents. SD is adjusted in the same way as aADE.

  • Collision Index (CI): We choose the CI metric to penalize any collision of pedestrian(s) with the car(s). For each pedestrian, CI is the portion of that pedestrian's simulated trajectory that overlaps with any car’s occupancy; a value of zero means no collision. CI is averaged over all pedestrians and adjusted like the other metrics.
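The metrics in the list above can be sketched in Python as follows. Several points are assumptions made for illustration only: the displacement error is taken as the Euclidean distance per time step, the adjustment is implemented as a linear rescaling to a reference length `ref_steps` (one possible reading of "assuming that the error increases linearly"), and a car's occupancy is approximated by a circle of radius `car_radius`. All function names are hypothetical.

```python
from math import dist


def ade(sim, real):
    """Average displacement error of one agent over all time steps."""
    return sum(dist(p, q) for p, q in zip(sim, real)) / len(real)


def fde(sim, real):
    """Displacement error of one agent at its final time step."""
    return dist(sim[-1], real[-1])


def adjusted(error, n_steps, ref_steps):
    """Rescale an error to a reference trajectory length, assuming linear error growth."""
    return error * ref_steps / n_steps


def collision_index(ped_traj, car_traj, car_radius):
    """Portion of a pedestrian's trajectory lying inside the (circular) car occupancy."""
    hits = sum(1 for p, c in zip(ped_traj, car_traj) if dist(p, c) < car_radius)
    return hits / len(ped_traj)


def mean_over_agents(metric, sims, reals):
    """Average a per-agent metric over all agents."""
    return sum(metric(s, r) for s, r in zip(sims, reals)) / len(reals)


if __name__ == "__main__":
    real = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
    sim = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0), (3.0, 1.5)]
    car = [(2.0, 1.0)] * 4  # hypothetical stationary car
    print("ADE:", ade(sim, real), "FDE:", fde(sim, real))
    print("aADE:", adjusted(ade(sim, real), n_steps=len(real), ref_steps=8))
    print("CI:", collision_index(sim, car, car_radius=0.6))
```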

VII-B Results

Table V shows the performance of the GSFM-M1, GSFM-M2, GSFM-M3, GSFM-U, and classical SFM models on the HBS, DUT, and CITR data sets, evaluated using the above-described metrics. For pedestrians, each entry of Table V reports four scores, namely aADE, aFDE, SD, and CI, respectively; for cars, three scores are shown, as CI is only calculated from the perspective of pedestrians. The bold number indicates the best score. Overall, the GSFM-M1 and GSFM-M2 models perform similarly, and both outperform the universal model GSFM-U, whereas GSFM-M3 performs mostly similar to GSFM-U. All versions of the GSFM model achieve lower aADE and aFDE than the classical SFM on every data set, while their SD and CI scores on the DUT data set are comparable to those of SFM. For all data sets, the average trajectory errors of our best-performing model, i.e., aADE and aFDE, range from roughly 0.5 m to 1 m for pedestrians. We consider this a good result given the stochasticity in pedestrian behavior and the similarity to the results presented in [sadeghian2019sophie], a state-of-the-art pedestrian trajectory prediction model that was evaluated on pedestrian-only scenarios. However, the aADE/aFDE scores of our model for vehicles are higher than those for pedestrians, i.e., the error is larger, mainly for the CITR data set. One reason for this is the significant difference between the simulated and real speeds of vehicles. Thus, improving our vehicle motion modeling, e.g., by considering different motion patterns and speed profiles of vehicles, is part of our future work.

In all cases, the collision index CI is small, which indicates that all models simulate collision-free trajectories most of the time. Moreover, in terms of CI, our models perform much better than SFM for the CITR and HBS data sets, but due to the higher pedestrian density in DUT, the performance of our models drops and becomes similar to that of SFM. For SFM, the entries for cars are empty because the classical SFM can only model pedestrian motion. Thus, when simulating the extracted scenarios with SFM, the cars follow their real trajectories.

Fig. 11: Pedestrians-to-car interaction from the CITR data set. The trajectories of road users: real, simulated in GSFM-U, and simulated in GSFM-M2 are visualized respectively, from left to right.

To show the differences between the DUT, HBS, and CITR data sets and the capability of our model to address these differences, we choose one scenario from each data set and simulate each scenario in GSFM-M2. In Figures 9, 10, and 11, the dotted lines indicate the real trajectories and the solid lines represent the simulated trajectories of road users. In Figure 9 and Figure 10, the real and simulated trajectories are visualized at two specific subsequent time steps. The black lines in Figure 9 and Figure 11 indicate the trajectories of the car, and the color-coded lines depict the trajectories of pedestrians.

Figure 9 visualizes a crowd-to-car interaction scenario from the DUT data set. Here, the first row shows the real trajectories of the involved road users, and the second row visualizes the simulated trajectories. Most of the DUT scenarios contain a large number of pedestrians, as shown in Figure 9.

Figure 10 depicts a complex pedestrian road-crossing example with cars coming from two directions, extracted from the HBS data set. In both simulation and reality, both cars stop to let the pedestrians cross first, which is a common phenomenon in HBS scenarios.

Figure 11 shows a pedestrians-to-car interaction scenario from CITR. As visualized in Figure 11, GSFM-U simulates all pedestrians in a similar style, while in GSFM-M2, pedestrians follow different motion patterns. Thus, the simulated trajectories of pedestrians in GSFM-M2 are closer to their real trajectories than the trajectories generated by GSFM-U.

To sum up, in all example scenarios, our model realistically simulates complex interactions among pedestrians and car(s). Table V shows that our model performs satisfactorily for all data sets. Thus, our model was able to model scenarios from new data sets (i.e., CITR and DUT) convincingly and with minimal effort compared to traditional approaches (i.e., starting the modeling process from scratch for each new case), through the integration of new types of interactions into the model and a largely automated calibration process. This demonstrates the generalizability of our model. In addition, the results of our quantitative evaluation and the discussion of the scenario in Figure 11 indicate that accounting for heterogeneous motion patterns of pedestrians improves the performance of our model.

VIII Conclusion and Future Work

In this paper, we first proposed a procedure to formulate general motion models and applied this process to extend our Game-Theoretic Social Force Model (GSFM) towards a general model for generating realistic trajectories of pedestrians and cars in different shared spaces. Second, we applied and examined two clustering approaches, namely Principal Component Analysis (PCA) with the k-means algorithm and k-means with the forward selection method, to recognize and model different motion patterns of pedestrians.

We calibrated, validated, and evaluated our model using three shared space data sets, namely the HBS, DUT, and CITR data sets. These data sets differ from one another in terms of spatial layout, types of interactions, traffic culture, and density. In both the quantitative and the qualitative evaluation, our model performed satisfactorily for each data set, which indicates that by following a systematic procedure with a well-defined calibration methodology, a shared-space model can adapt to a new environment and model a large variety of interactions. The results also indicate that accounting for the heterogeneity in pedestrian motion improves the performance of our model.

Our future research will focus on improving the motion model for vehicles, adding new modalities (e.g., cyclists) into our model, calibrating the model parameters for a wider range of interactions (e.g., vehicle-to-vehicle complex interaction), recognizing different motion patterns of other user types such as vehicles, and calibrating and evaluating our model using more open-source data sets of shared spaces. Most significantly, we shall study large scenarios with a larger number of participants to investigate the scalability of different interaction types and also our simulation model.


This work is supported by the German Research Foundation (DFG) through the Research Training Group SocialCars (GRK 1931) and by the United States Department of Transportation under (#69A3551747111) for the Mobility21 University Transportation Center. We acknowledge the DFG research project MODIS (#248905318) for sharing the HBS data set.