AI perspectives in Smart Cities and Communities to enable road vehicle automation and smart traffic control

04/07/2021 ∙ by Cristofer Englund, et al. ∙ RISE Research Institutes of Sweden

Smart Cities and Communities (SCC) constitute a new paradigm in urban development. SCC envisions a data-centered society aiming at improving efficiency by automating and optimizing activities and utilities. Information and communication technology, along with the internet of things, enables data collection, and with the help of artificial intelligence (AI) situation awareness can be obtained to feed the SCC actors with enriched knowledge. This paper describes AI perspectives in SCC and gives an overview of AI-based technologies used in traffic to enable road vehicle automation and smart traffic control. Perception, smart traffic control and driver modelling are described, along with open research challenges and standardization efforts that help introduce advanced driver assistance systems and automated vehicle functionality in traffic. To fully realize the potential of SCC and create a holistic view on a city level, the availability of data from different stakeholders is needed. Further, although AI technologies provide accurate predictions and classifications, there is an ambiguity regarding the correctness of their outputs, which can make it difficult for the human operator to trust the system. Today there are no methods that can be used to match function requirements with the level of detail in data annotation in order to train an accurate model. Another challenge related to trust is explainability: as long as the models cannot explain how they reach a certain conclusion, it is difficult for humans to trust them.




1 Introduction

Smart Cities and Communities (SCC) is an emerging research field that spans many dimensions and is promoted by major advances in technology, changes in business operation, and the overall environmental challenge. This paper reviews and positions the authors' research domain and interests within AI in Smart Cities while also pointing out challenges for future research. One of the enabling technologies for SCC is Information and Communication Technologies (ICT), which connect infrastructure, resources, and services with data surveillance and asset management systems. Another is the Internet of Things (IoT), which enables even the smallest device to connect to the internet and share its operational status Belli et al. (2020); Scuotto et al. (2016). Devices from, e.g., the transportation system, power plants, or residential houses can be connected with the use of IoT technology Arasteh et al. (2016). Modern business environments are highly competitive, and organizations are constantly finding ways to reduce cost. In addition, businesses are exploring new ways of developing their operations, moving towards providing not only a product but also a service connected to the product, which is becoming popular in many domains (Tukker, 2015). Economics can be seen as the main driver for industry, and the environmental challenge as the main driver for political and private actors (Gärling and Schuitema, 2007). We have a great responsibility to protect our natural resources for our descendants.

To unlock the potential of SCC, in any application area, the collected data needs to be processed and analyzed. With the help of AI, relationships, root causes, and patterns can be found in the data. AI can then use the new information to tailor guidance and provide suggestions to users on how to improve behavior (Byttner et al., 2011; Englund and Verikas, 2008).

There still exist various challenges related to SCC, some of which are listed in the next section.

Challenges addressed by SCC

Business operation has changed dramatically during the last 60 years. GDP changes from 1947 to 2009 clearly show a decrease in industry and growth in professional and business services. In the US, the GDP share of industry decreased by around 50%, whereas the GDP share of professional and business services grew by 400% (gdp, 2021). This trend continues: in the last 17 years, the service share of GDP has increased from 72.8% to 77.4%, whereas the industry share has decreased from 22.5% to 18.2% (Departmen, 2020).

Digitalization, enabled by ICT, is a major contributor to this trend, and with the latest advances in computing power, AI is becoming a key technology for making use of the data to further develop these services.

Another challenge is energy consumption, in particular energy that comes from non-renewable sources such as oil. In Europe, the primary energy consumption was 1 561 million tonnes of oil equivalent (Mtoe) in 2017, 5.3% above the EU target for 2020. In 2018, energy came from petroleum products including crude oil (36%), natural gas (21%), renewable energy (15%), solid fossil fuel (15%) and nuclear energy (13%) (Eurostat, 2020). The energy consumption by sector in the EU breaks down in the following way: industry (31% of final energy consumption), transport (28%), households (25%), services (13%) and agriculture & forestry (2%) (Eurostat, 2019).

In this paper, we primarily address transportation within SCC and how AI can be used to improve efficiency and thus reduce energy consumption. AI can be used to learn traffic behavior and to control traffic both on the micro level, e.g. in intersections (Chen and Englund, 2016), and on the macro level (Englund et al., 2014a).

In the 28 EU Member States (EU-28), energy consumption in transport increased by 32% between 1990 and 2017. In the EU-13 states, the increase was 102% during the same period. Road transport accounts for 73% of the total energy consumption in the transport sector, and road transport alone increased by 34% between 1990 and 2017 (Agency, 2019).

For 2020, the expectation, due to the Covid-19 pandemic, is that energy demand will be 10% below 2019 levels. This would be twice the decline experienced during the 2008-2009 financial crisis. CO2 emissions in the EU declined by 8% during the first quarter of 2020 compared with the same period in 2019 (EU2, 2021).

Traffic safety is also a global challenge, and traffic accidents have become one of the most common causes of death among young people (World Health Organization (WHO), 2015). Although fatalities have decreased for motorists in most countries, this is not the case for vulnerable road users (VRUs) (Niska and Eriksson, 2013), including pedestrians, bicyclists and moped riders. In Europe, 22 700 people lost their lives in traffic in 2019 (Commission, 2019a) and more than 1.4 million people were injured in 2018 (Commission, 2018). Worldwide, 1.35 million people lost their lives and up to 50 million were injured in traffic accidents in 2018 (Organization, 2019).

Given these facts about business development, energy and traffic safety, it is clear that there is huge potential in society for improvements, both in terms of energy efficiency and traffic safety.

With a starting point in SCC, the enabling technologies and the challenges mentioned above, this paper will focus on how AI can be used to enable energy saving and improved traffic safety within SCC. Consequently, the EU has set goals on energy consumption in the transportation sector: the 2030 climate and energy goals for a competitive, secure and low-carbon EU economy state that greenhouse gas (GHG) emissions should be reduced by 55% compared to the 1990 level and that the share of renewable energy should be at least 27% (EuC, 2021; Commission, 2014).

With the trend towards increased vehicle automation, there is a large potential for reducing the effects of an accident or, if possible, avoiding the accident completely. This can be done with IoT, i.e. by building sensor-based safety systems that can detect VRUs and give warnings or actively react to the information. Enabling the development of such systems requires knowledge of how these road users behave, and how that behavior can be described so that the automated vehicle functions can make correct interpretations and decisions. Also, vehicular communication can help traffic coordination and reduce travel time for e.g. emergency vehicles (Barrachina et al., 2014; Englund et al., 2016; Englund et al., 2014a). Consequently, the European Commission has set goals on traffic safety: close to zero deaths in 2050 and, as an interim goal, halving the number of seriously injured by 2030 from the 2020 level (Commission, 2019b).

Energy efficiency and traffic safety are the two main goals of SCC in the domain of the transportation system.

With the help of AI and data analytics it may be possible to improve utilization of the manageable assets within the transportation system. In particular, this paper describes the on-board AI-based systems along with the infrastructure AI-based systems that build up SCC, addressing traffic safety and efficiency. An overview is given of the research areas of perception, traffic control, and interaction.

The rest of the paper is organized as follows. Section 2 describes research initiatives, projects and financial programs. Section 3 describes the different approaches of using AI in SCC such as perception, traffic system control, and driver monitoring. Section 4 highlights open research questions and standardization to facilitate implementation and adoption. Finally, Section 5 provides a summary and conclusion of the findings.

2 Research initiatives within Smart Cities and Communities

In the EU, the European Innovation Partnership on Smart Cities and Communities (EIP-SCC) is an initiative supported by the European Commission that brings together cities, industry, Small and medium-sized enterprises (SMEs), banks, research and other smart city actors to share information and find partners for projects (EU-, 2021b).

The EU project CITYKeys is funded by the European Union HORIZON 2020 programme. In collaboration with cities, the project developed and validated key performance indicators along with data collection procedures for common and transparent monitoring, to enable comparison of smart city solutions across European cities (Cit, 2021). The project has divided the Smart City into sub-themes. Diversity and social cohesion aims at promoting diversity, community engagement and social cohesion within a community. Education focuses on improving accessibility and quality of education for everyone. Safety concerns lowering the rate of crime and accidents. Health focuses on improving quality and accessibility of public health systems for everyone, as well as encouraging a healthy lifestyle. Quality of housing and the built environment promotes development of mixed-income areas, ensures high quality and quantity of public spaces and recreational areas, and improves affordability of and accessibility to good housing for everyone. Finally, quality of life also covers access to (other) services, focusing on providing better access to amenities and affordable services in physical and virtual space for everyone. CITYKeys also aims to harmonize data collection from the involved cities to enable comparisons of the performance of the measures introduced to reach the EU energy and climate targets.

CIVITAS (EU-, 2021a) is a pan-European network of cities dedicated to cleaner and better transport. The network is financed by the European Commission and was launched in 2002. Since then, 85 cities have joined the network and more than 900 measures and urban transport solutions have been tested and implemented. The main goal of CIVITAS is to make it easier for cities to obtain cleaner and better-connected transport solutions in Europe and beyond. The four main characteristics of CIVITAS are a living-lab approach to carrying out research projects, maintaining a network of cities for cities, facilitating public-private partnerships, and promoting political commitment. The network gives unique opportunities for practitioners to experience innovative transport solutions and learn from experts in the field. Sustainable mobility is the overall area, and the project is divided into 10 sub-areas: Car-Independent Lifestyles; Clean Fuels & Vehicles; Collective Passenger Transport; Demand Management Strategies; Integrated Planning; Mobility Management; Public Involvement; Safety & Security; Transport Telematics; Urban Freight Logistics.

In Sweden, where this research is carried out, there are a number of initiatives from the Swedish government to support the development of SCC. The Strategic Innovation Program (SIP) Drive Sweden is a Swedish cross-disciplinary collaboration platform driving the development towards sustainable mobility solutions for people and goods (dri, 2021). Drive Sweden is an important stakeholder in future mobility that fosters national as well as international collaboration to encourage the development of future sustainable mobility. Drive Sweden is financed by the Swedish Innovation Agency, Vinnova, the Swedish Energy Agency and the Swedish research council for sustainable development, Formas. Drive Sweden also provides a weekly newsletter that summarizes news within the area of smart mobility (Dri, 2021). Examples of projects that have been financed by Drive Sweden, within the field of future mobility, include: Study of communication needs in interaction between trucks and surrounding traffic in platooning, Intelligent and self-learning traffic control with 3D & AI, and Security for autonomous vehicles from a societal and safety perspective.

InfraSweden2030 (inf, 2021) is another Strategic Innovation Program in Sweden. Whereas Drive Sweden focuses on future mobility, InfraSweden2030 focuses on the transport infrastructure of the future. InfraSweden2030 is also financed by Vinnova, the Swedish Energy Agency and Formas. The aim of the program is to contribute to reduced climate and environmental impact from the construction, operation and maintenance of the transport infrastructure. The programme organizes seminars and workshops to facilitate collaboration and innovation within the Swedish transport infrastructure sector in order to address society's economic and social challenges. In addition, InfraSweden2030 funds research projects that address these goals and challenges. The three objectives of the programme are to develop innovation for transport infrastructure, to create an open, dynamic and attractive environment, and to reduce impacts on the environment and climate. An example of a project financed by the InfraSweden2030 programme is the iBridge project, whose overall aim is to automate and make available knowledge about bridges that can be used to lower maintenance cost.

Viable Cities (Via, 2021) is also a Swedish Strategic Innovation Program, focusing on smart sustainable cities. The vision of the programme is to accelerate the transition towards inclusive, climate-neutral cities by 2030, with a good life for all, with the help of digitalization and citizen engagement. Viable Cities is, like its siblings Drive Sweden and InfraSweden2030, financed by Vinnova, the Swedish Energy Agency and Formas. Within the Viable Cities programme, a strategic initiative, Viable Cities Transition Lab, has been formed to foster capabilities to handle the societal challenges of the climate and environment transition. The Transition Lab aims at unlocking the full potential of humans in the era of digitalization and automation, thus obtaining new methods to transform society towards an equal and circular economy, and responsible and ground-breaking technology to create behavioral change towards a more sustainable and entrepreneurial society. Xplorion - Residential mobility service in car-free accommodation is a project financed by Viable Cities (Xpl, 2021). It offers a mix of mobility services, such as public transport, carpool and bicycle pool, to households in a new residential area called Södra Brunnshög in Lund, Sweden. The aim is to provide mobility as part of the residential rent and thus allow more efficient use of transport, leading to a reduction in emissions from residents' travel. It is also expected that by connecting housing and mobility, there will be synergies that make the resources in the system more efficient than today.

Smart City Sweden (Sma, 2021) is a governmental export platform for sustainable city solutions. The platform reaches out to international delegates who are interested in investing in smart & sustainable city solutions from Sweden. The platform has five focus areas: Mobility; Climate, Energy & Environment; Digitalization; Social Sustainability; and Urban Planning. Through their web page Smart City Sweden promotes solutions ranging from smart production of biogas from household waste, and water treatment facilities to congestion pricing solutions in Stockholm, future multi-modal transportation services and service platforms aiming at supporting an automated transportation system.

3 Approaches

Figure 1: A sample smart city scenario. Information is continuously shared among different buildings, infrastructures and vehicles. Thus, autonomously operating vehicles safely react to detected VRUs.
                 Strategical   Tactical   Operational
In-vehicle            x            x           x
Infrastructure        x            x

Table 1: Overview of in-vehicle and infrastructure-based systems' contribution to road vehicle automation.

This section describes different approaches to using AI in SCC. The paper is founded on the perspectives of the authors' own research: it first gathers previous research in the SCC domain and, second, with regard to operational, tactical and strategical vehicle and traffic functions, it describes future challenges from a speculative design approach to enable road vehicle automation and smart traffic control.

Figure 1 illustrates a sample smart city scenario where information is continuously shared among different units, such as smart buildings, vehicles and infrastructure, to enable road vehicle automation and smart traffic control applications. Autonomously operating vehicles can thus safely react to detected VRUs.

Table 1 shows how in-vehicle and infrastructure-based systems contribute to the different levels of control in automated driving. The driving tasks can broadly be categorized into three levels: strategic, tactical and operational (Aramrattana et al., 2015). The strategic tasks comprise high-level (and longer-term) planning decisions such as route choice, traffic flow control, and fuel cost estimates, whereas operational tasks include low-level (short-term) and continuous routine tasks such as lateral control based on immediate environmental input, and in-vehicle input such as driver monitoring. The tactical tasks fall in the middle: mid-level, medium-term tasks including, but not limited to, turning, overtaking, gap adjustment and merging, based on local awareness around the vehicle. In the following subsections we describe perception systems enabling situation awareness for autonomous vehicles, approaches for traffic system control, and finally examples of driver monitoring systems.

3.1 Perception

Mobility within SCC concerns several of the challenges described above, e.g. traffic safety and environmental impact. These challenges in turn drive the technology development towards improved sensor systems that can improve vehicles' perception to help the driver in hazardous situations, e.g. with advanced driver assistance systems (ADAS). ADAS are functions that automate vehicle functions to improve safety or comfort; examples are lane keeping aid, automated emergency braking, and adaptive cruise control. With the help of sensors and AI, the vehicles' perception systems are becoming more and more intelligent, and there are now several examples of highly automated vehicular systems. Realizing road vehicle automation builds on the assumption that the vehicle can maneuver automatically by itself. This requires local awareness around the vehicle to be able to handle obstacles, hazardous situations, and unanticipated events. One way to achieve local awareness is through on-board sensors. Camera, radar, and LiDAR (light detection and ranging) sensor signals are typically fused to obtain scene understanding (Sivaraman and Trivedi, 2013). Such sensors operate in the range from a few centimeters to 200 m (Kocić et al., 2018).

Another way to obtain awareness in traffic is to use sensors in the infrastructure (Englund, 2020b, a) and using wireless communication to exchange information between vehicles and infrastructure (Lidström and Larsson, 2009). This section describes perception systems enabling situation awareness in traffic.

Scene understanding is an essential prerequisite for autonomous vehicles to increase their local awareness. Semantic segmentation and object detection are two fundamental lower-level perception components which help in gaining a rich understanding of the scene. Safety-critical systems, such as highly automated vehicles, however, require not only highly accurate but also reliable predictions with a consistent measure of uncertainty. This is because quantitative uncertainty measures can be propagated to subsequent units, such as decision-making modules, leading to safe maneuver planning or emergency braking, which is of utmost importance in safety-critical systems. Therefore, semantic segmentation and object detection integrated with reliable confidence estimates can significantly reinforce the concept of safe mobility within SCC.

Given an image or point cloud data stream, there exist two mainstream deep learning-based AI approaches used for real-time object detection tasks (Liu et al., 2020): two-stage and one-stage detection frameworks. Two-stage methods (Girshick et al., 2014, 2015; He et al., 2017; Ren et al., 2016) initially have a preprocessing step to generate category-independent object region proposals. The output of this step is then passed to the category-specific classifier, which returns the category label for each detected proposal. On the other hand, one-stage detectors (Szegedy et al., 2013; Redmon et al., 2016; Liu et al., 2016; Law and Deng, 2018) are region proposal-free frameworks where the proposal generation step is not separated, and thus the entire framework works in a unified end-to-end fashion. Such unified approaches directly predict class probabilities together with the bounding boxes using single feed-forward networks. In contrast to unified (one-stage) models, region-based (two-stage) methods achieve relatively better detection accuracy at the cost of being computationally more expensive. Unlike two-stage networks, the detection accuracy of a one-stage model is less sensitive to false detections coming from the backbone network.
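Both detector families typically end with a non-maximum suppression (NMS) step that removes duplicate predictions of the same object. A minimal numpy sketch of greedy NMS is shown below; function names and the threshold value are illustrative, not taken from any of the cited works.

```python
import numpy as np

def iou(box, boxes):
    """Intersection-over-union between one box and an array of boxes.
    Boxes are [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = np.argsort(scores)[::-1]       # highest score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        rest = order[1:]
        # Suppress remaining boxes that overlap the kept box too much.
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_thresh]
    return keep
```

In a one-stage detector this runs directly on the network's box predictions; in a two-stage detector a similar step is often also applied to the region proposals.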

Regarding the task of semantic scene segmentation, advanced deep neural networks are heavily used to generate accurate and reliable segmentation with real-time performance. Most of these approaches, however, rely on camera images (Kendall et al., 2015; Zhao et al., 2017; Chen et al., 2018; Chen et al., 2018; Poudel et al., 2019), whereas relatively fewer contributions have discussed the semantic segmentation of 3D LiDAR data (Wu et al., 2018; Milioto et al., 2019; Cortinhal et al., 2020). The main reason is that, unlike camera images, LiDAR point clouds are relatively sparse, unstructured, and non-uniformly sampled, although LiDAR scanners have a wider field of view and return more accurate distance measurements.

As comprehensively described in (Guo et al., 2019), there exist two mainstream deep learning approaches addressing the semantic segmentation of 3D LiDAR data only: pointwise and projection-based neural networks. The former approaches operate directly on the raw 3D points without requiring any pre-processing step (Qi et al., 2017, 2018; Landrieu and Simonovsky, 2018), whereas the latter project the point cloud into various formats such as 2D image view (Milioto et al., 2019; Aksoy et al., 2019; Cortinhal et al., 2020; Wu et al., 2018, 2019) or high-dimensional volumetric representation (Zhang et al., 2018; Zhou and Tuzel, 2018). There is, however, a clear split between these two approaches in terms of accuracy, runtime, and memory consumption. Projection-based approaches can achieve state-of-the-art accuracy while running significantly faster. Although point-wise networks have slightly fewer parameters, they cannot efficiently scale up to large point sets due to the limited processing capacity, thus, they have a longer runtime.
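The core idea behind projection-based networks can be illustrated with a spherical projection that turns an unstructured point cloud into a 2D range image on which ordinary image networks can operate. The sketch below is a simplification; the image size and field-of-view values are illustrative (roughly those of a 64-beam spinning LiDAR), not taken from the cited papers.

```python
import numpy as np

def spherical_projection(points, h=64, w=1024, fov_up=3.0, fov_down=-25.0):
    """Project an (N, 3) LiDAR point cloud onto an (h, w) range image.

    fov_up/fov_down are the vertical field-of-view limits in degrees.
    Each pixel holds the range of the nearest point mapped to it.
    """
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    fov = fov_up - fov_down

    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                    # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / r, -1, 1))  # elevation

    # Normalize the angles to pixel coordinates.
    u = ((1.0 - (pitch - fov_down) / fov) * h).astype(int)
    v = ((0.5 * (1.0 - yaw / np.pi)) * w).astype(int)
    u, v = np.clip(u, 0, h - 1), np.clip(v, 0, w - 1)

    image = np.full((h, w), -1.0)             # -1 marks empty pixels
    # Fill far-to-near so the closest point wins at each pixel.
    order = np.argsort(r)[::-1]
    image[u[order], v[order]] = r[order]
    return image
```

A segmentation network then predicts a label per pixel, and the labels are mapped back to the 3D points, which is what makes these methods fast compared to pointwise networks.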

When it comes to uncertainty estimation, Bayesian Neural Networks (BNNs) are intensively used, since such networks can learn an approximate distribution on the weights to further generate prediction confidences. There exist two types of uncertainty: aleatoric, which quantifies the intrinsic uncertainty in the observed data, and epistemic, where the model uncertainty is estimated by inferring with the posterior weight distribution, usually through Monte Carlo sampling. Unlike aleatoric uncertainty, which captures the irreducible noise in the data, epistemic uncertainty can be reduced by gathering more training data. For instance, segmenting out an object that has relatively few training samples in the dataset may lead to high epistemic uncertainty, whereas high aleatoric uncertainty may instead occur on segment boundaries or distant and occluded objects due to noisy sensor readings, which are inherent in the sensors. Bayesian modelling helps estimate both uncertainty types.

Gal et al. (Gal and Ghahramani, 2016) proved that dropout can be used as a Bayesian approximation to estimate the uncertainty in classification, regression, and reinforcement learning tasks; this idea was also extended to semantic segmentation of RGB images by Kendall et al. (Kendall et al., 2015). Recently, both uncertainty types were applied to 3D point cloud object detection (Feng et al., 2018), optical flow estimation (Ilg et al., 2018), and semantic segmentation of 3D LiDAR point cloud data (Cortinhal et al., 2020).
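The mechanics of Monte Carlo dropout can be shown with a toy one-layer regressor: dropout stays active at test time, and the spread of repeated stochastic forward passes serves as an epistemic-uncertainty proxy. Everything here (layer size, dropout rate, sample count) is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout_predict(x, weights, n_samples=100, p_drop=0.5):
    """Monte Carlo dropout for a toy one-layer linear regressor.

    Repeated stochastic forward passes with dropout left on give a
    predictive mean and a variance across samples (the epistemic proxy).
    """
    preds = []
    for _ in range(n_samples):
        mask = rng.random(weights.shape) > p_drop      # drop units at random
        preds.append(x @ (weights * mask) / (1.0 - p_drop))
    preds = np.array(preds)
    return preds.mean(axis=0), preds.var(axis=0)

weights = rng.normal(size=(8, 1))   # toy "trained" weights
x = rng.normal(size=(1, 8))         # one input sample
mean, var = mc_dropout_predict(x, weights)
```

In a real segmentation network the same idea is applied per pixel, producing an uncertainty map alongside the label map.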

Typical algorithms for sensor fusion include Kalman filters (Welch and Bishop, 1995). Kalman filters are recursive filters that estimate the state of a system from several noisy measurements. This technology is used, for example, in (Lidstrom et al., 2012), where a vehicle for the Grand Cooperative Driving Challenge was developed. Also in (Kianfar et al., 2012), Kalman filters were used for sensor fusion and scene understanding.
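The recursive predict-update structure of a Kalman filter can be sketched in its simplest scalar form; the noise variances below are illustrative, and real vehicle applications use multi-dimensional state vectors and motion models.

```python
def kalman_1d(measurements, q=1e-3, r=0.5, x0=0.0, p0=1.0):
    """Minimal 1D Kalman filter with a random-walk state model.

    q: process-noise variance, r: measurement-noise variance.
    Returns the filtered state estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed constant, uncertainty grows by q.
        p = p + q
        # Update: blend prediction and measurement via the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates
```

Feeding the same noisy quantity from several sensors into the update step is the basic fusion mechanism the cited works build on.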

For autonomous vehicles to behave efficiently and safely in traffic, they require not only scene understanding derived from perceptual data, but also algorithms that can model and predict other road users' behavior. Modelling of behavior and prediction of motion has long been of interest and is applicable especially in domains where humans and intelligent systems co-exist (Rudenko et al., 2019). In a survey (Lefevre et al., 2014), which addresses motion-prediction applications in intelligent vehicles, the authors proposed three main categories for how agent motion is modelled: physics-based, maneuver-based and interaction-aware approaches. Work that focuses on the acceleration and deceleration behavior of different vehicle types employs physics-based methods, e.g. (Bokare and Maurya, 2016) and (Maurya and Bokare, 2012). (Lefevre et al., 2013) suggest an interaction-aware method, for instance for risk assessment in traffic. Maneuver-based approaches assume that the maneuver intention can be recognized early on and that the future trajectory should match that maneuver. The main idea in this approach is that real-world trajectories from a road agent can be clustered into categories representing different behaviors. Based on a set of behaviors, maneuver-based motion prediction approaches employ estimation techniques, for instance Gaussian Processes (Christopher, 2009; Joseph et al., 2011), to estimate the most probable future maneuvers. Deep-learning techniques have also been applied to cluster vehicle encounters (Li et al., 2018).
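The simplest physics-based predictor propagates a road user's current state under a constant-velocity assumption; the sketch below is a generic illustration of that category, not a method from the cited works, and the step size and horizon are arbitrary.

```python
import numpy as np

def predict_constant_velocity(position, velocity, dt=0.1, horizon=20):
    """Physics-based motion prediction: propagate a road user's 2D
    position assuming its velocity stays constant.

    Returns an array of the predicted positions over `horizon` steps.
    """
    position = np.asarray(position, dtype=float)
    velocity = np.asarray(velocity, dtype=float)
    steps = np.arange(1, horizon + 1)[:, None] * dt   # elapsed time per step
    return position + steps * velocity
```

Maneuver-based and interaction-aware approaches replace this fixed motion model with learned or behavior-conditioned models, which is what makes them more accurate over longer horizons.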

Roundabouts play a very important role in modern traffic infrastructure. Studies have shown that roundabouts reduce injury crashes (in comparison to signal-controlled intersections), can reduce delays and improve traffic flows, and even have lower long-term cost than signal-controlled intersections (WSDT, 2019). A study that employs support vector machines to classify whether vehicles inside a roundabout will stay or leave is presented in (Zhao et al., 2017). Similarly, a study estimating the effects of the roundabout layout on driver behavior, employing simulation data, is presented in (Zhao et al., 2017). A method for estimating reachable paths using conditional transition maps is presented in (Kucner et al., 2013). A study that employs a stereo camera setup for time-to-contact estimation is presented in (Muffert et al., 2012); this study focuses on risk assessment rather than efficiency and smoothness of drive when entering a roundabout.

Recent work by Muhammad and Åstrand (Muhammad and Åstrand, 2018) applies particle filters to predict road user behavior. In (Muhammad and Åstrand, 2019) they address the problem of modelling and predicting agent behavior and state in a roundabout traffic scenario. They present three ways of modelling traffic in a roundabout, based on (i) the roundabout geometry (which can be generated using drawings or satellite images, etc.); (ii) the mean path taken by vehicles inside the roundabout; and (iii) a set of reference trajectories traversed by vehicles inside the roundabout. The roundabout models were compared in terms of exit-direction classification and state (i.e., position inside the roundabout) prediction of query vehicles inside the roundabout. The results show that the model based on a set of reference trajectories is better suited, in terms of both early and robust exit-direction classification and more accurate state prediction. An additional experiment categorized vehicles into classes based on vehicle size (instead of a single class); results indicate that such a categorization can affect, and in some cases enhance, the state prediction accuracy. The particle filter approach in (Magavi, 2020) was compared to a Recurrent Neural Network (RNN), namely Long Short-Term Memory (LSTM) (Gers et al., 2000), to determine the specific behavior model. Additionally, the network performance was compared with other RNN architectures, such as the Bi-LSTM and the Bi-LSTM + LSTM stacked architecture, to evaluate which model performs best. Results showed that an LSTM network can predict the exit of the vehicle in a roundabout much sooner than the particle filter method and performs equally well when predicting the state of the vehicle in a roundabout.
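The particle-filter idea behind this line of work can be sketched in its simplest bootstrap form, tracking a single scalar state (e.g. a vehicle's position along its path). All noise levels, particle counts and the random-walk motion model are illustrative, not the configuration of the cited works.

```python
import numpy as np

rng = np.random.default_rng(1)

def particle_filter(measurements, n_particles=500,
                    process_noise=0.2, meas_noise=0.5):
    """Bootstrap particle filter for a 1D state under a random-walk
    motion model; returns the weighted state estimate per step."""
    particles = rng.normal(0.0, 1.0, n_particles)   # initial belief
    estimates = []
    for z in measurements:
        # Propagate each particle through the motion model.
        particles = particles + rng.normal(0.0, process_noise, n_particles)
        # Weight particles by the Gaussian measurement likelihood.
        weights = np.exp(-0.5 * ((z - particles) / meas_noise) ** 2)
        weights /= weights.sum()
        estimates.append(float(np.dot(weights, particles)))
        # Resample proportionally to the weights.
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]
    return estimates
```

In the roundabout setting, the particles would live on the reference trajectories rather than on a line, and the exit-direction classification follows from which trajectory cluster the surviving particles favor.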

Englund (Englund, 2020b, a) used real-world trajectories to predict the intention of cars and bicycles at an upcoming road exit. The AI methods used in (Englund, 2020b, a) were based on Support Vector Machines (Vapnik, 1998), Random Forests (Breiman, 2001) and Multi-Layer Perceptrons (Bishop, 1995). In (Englund, 2020b), a backward elimination strategy was used for selecting the most important variables for predicting the behavior of the cars and bicycles in intersections. For bicycles the most important variable was speed, and for cars it was position. Heading was also among the six best variables for both vehicle types.
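Backward elimination in general works by repeatedly dropping the feature whose removal hurts a model score the least. The generic sketch below illustrates the procedure, not the exact setup of (Englund, 2020b); `score_fn` is a caller-supplied goodness-of-fit function (higher is better).

```python
import numpy as np

def backward_elimination(X, y, score_fn, n_keep=2):
    """Greedy backward elimination over the feature columns of X.

    Repeatedly removes the feature whose removal leaves the best
    score, until n_keep feature indices remain.
    """
    features = list(range(X.shape[1]))
    while len(features) > n_keep:
        scores = {f: score_fn(X[:, [g for g in features if g != f]], y)
                  for f in features}
        # Drop the feature whose removal leaves the highest score.
        features.remove(max(scores, key=scores.get))
    return features
```

With a least-squares fit as the score, a noise column is eliminated first because removing it costs nothing, leaving the truly predictive variables, which mirrors how speed, position and heading survived the elimination in the cited study.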

Garcia et al. (Garcia et al., 2017) propose a mix of an Unscented Kalman Filter and Joint Probabilistic Data Association to fuse sensor readings from a vision-based system, a laser sensor and a global positioning system to obtain obstacle detection and object tracking. Li et al. (Li et al., 2013) suggest combining LiDAR and vision-based sensors to obtain lane detection and extraction of an optimal drivable region. Sivaraman and Trivedi (Sivaraman and Trivedi, 2013) review on-road vision-based vehicle detection, tracking, and behavior understanding. Besides Kalman filters, other algorithms such as Support Vector Machines (Vapnik, 1998), Adaboost (Freund et al., 1999), Hidden Markov Models (Jazayeri et al., 2011) and Gaussian mixture modelling (Wang and Lien, 2008) are used for various fusion tasks. Recently, deep learning in the form of Generative Adversarial Networks (GANs) has also been used for fusion of radar and camera sensor data.

In the context of multimodal object detection, most recent works fuse RGB camera images with 3D LiDAR point clouds (Qi et al., 2017; Chen et al., 2017; Liang* et al., 2019), whereas other works combine regular RGB data with thermal (Takumi et al., 2017) or depth images (Mees et al., 2017). In contrast to object detection, there are relatively few contributions on multi-modal semantic segmentation: (Valada et al., 2019) fuses RGB, depth, and thermal images, and (DK. et al., 2018) combines RGB and LiDAR data streams for semantic segmentation.

One of the main challenges in multi-modal perception is when to fuse the various sensor readouts (e.g. RGB, LiDAR, etc.), which have vast variations in time scales, dimensions (i.e. 2D versus 3D data), and signal types (i.e. continuous versus discrete). Deep neural networks, which are good at extracting and representing features hierarchically, provide various options to fuse sensor readings at different stages: early, middle, and late. Early fusion (Xu et al., 2018) directly merges raw data derived from different sensor modalities, e.g. by first concatenating raw scene features from different sensor modalities into a single vector and then training a deep neural network on this new feature representation. Late fusion (Wang and Jia, 2019) combines learned unimodal sensor features at the highest network layer into a final prediction. Middle fusion (Hazirbas et al., 2016) combines features learned at intermediate layers.
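The difference between early and late fusion can be illustrated with a minimal NumPy sketch. The random "linear heads" below stand in for trained per-modality networks, and all shapes and seeds are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
camera = rng.normal(size=(5, 64))   # 5 samples of flattened camera features
lidar = rng.normal(size=(5, 32))    # 5 samples of point-cloud features

def linear_head(x, n_classes, seed):
    """Stand-in for a trained network: a random linear classifier + softmax."""
    w = np.random.default_rng(seed).normal(size=(x.shape[1], n_classes))
    logits = x @ w
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)   # per-sample class probabilities

# Early fusion: concatenate raw features, then one joint model.
early_in = np.concatenate([camera, lidar], axis=1)   # shape (5, 96)
early_probs = linear_head(early_in, n_classes=3, seed=2)

# Late fusion: one model per modality, then average the predictions.
late_probs = 0.5 * (linear_head(camera, 3, seed=3) +
                    linear_head(lidar, 3, seed=4))
```

The structural trade-off discussed below is visible even here: adding a new modality changes `early_in`'s width (so the joint head must be retrained), whereas late fusion only needs one extra per-modality head.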

In contrast to other fusion strategies, early fusion requires less computation time and memory since the raw data readings are jointly processed. However, such methods are inflexible to changes in the network input type. For instance, when a new sensing modality is introduced (Tavares de Araujo Cesariny Calafate et al., 2016; Alvear et al., 2018), early fusion networks need to be retrained from scratch. In such cases, late fusion approaches are more flexible since only the domain-specific network needs to be retrained while the networks processing the other sensor data types remain the same. Although middle fusion networks are also relatively flexible, the network architecture design, i.e. finding the optimal combination of intermediate features, is non-trivial. Despite advanced fusion networks (Xu et al., 2018; Wang and Jia, 2019; Hazirbas et al., 2016) that achieve state-of-the-art performance on challenging object detection and semantic segmentation datasets, the lack of guidelines for designing optimal fusion networks remains a challenge, since most networks are designed empirically (Feng et al., 2019).

3.2 Traffic System Control

Simulation tools are efficient for planning the future transportation system. Besides giving realistic visualizations of the future transportation system, the simulation models can provide valuable information on how the system functions under different conditions. Simulation of Urban Mobility (SUMO) is a popular open source traffic micro-simulation tool (Krajzewicz et al., 2012, 2002). It has been used to simulate smart infrastructure that could improve traffic flow and energy efficiency (Englund et al., 2014b). One of the challenges with simulation is to validate the results; building infrastructure is costly and the planning horizon is 50 years. To improve the simulation models, and to adapt to future vehicles that will have different levels of automation, researchers at UC Berkeley have proposed a plug-in for SUMO called Flow (Kheterpal et al., 2018; Wu et al., 2017). Flow is developed to include fully automated, semi-automated and manually driven vehicles in the simulation. The automated vehicle models take into consideration information from surrounding vehicles and infrastructure to be able to optimize the traffic behavior (Wu et al., 2017).
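At the core of such micro-simulators is a per-vehicle car-following update. One widely used car-following model, also available in SUMO, is the Intelligent Driver Model (IDM); the sketch below implements its acceleration equation with illustrative parameter values.

```python
import math

def idm_acceleration(v, v_lead, gap,
                     v0=30.0,   # desired speed (m/s) - illustrative
                     T=1.5,     # desired time headway (s)
                     a=1.5,     # maximum acceleration (m/s^2)
                     b=2.0,     # comfortable deceleration (m/s^2)
                     s0=2.0):   # minimum standstill gap (m)
    """Intelligent Driver Model: acceleration of a follower given its own
    speed v, the leader's speed v_lead, and the bumper-to-bumper gap."""
    dv = v - v_lead                               # closing speed
    s_star = s0 + max(0.0, v * T + v * dv / (2 * math.sqrt(a * b)))
    return a * (1 - (v / v0) ** 4 - (s_star / gap) ** 2)

# Free road: very large gap -> vehicle accelerates towards its desired speed.
acc_free = idm_acceleration(v=20.0, v_lead=20.0, gap=1000.0)
# Tailgating a slower leader at a small gap -> strong braking.
acc_close = idm_acceleration(v=20.0, v_lead=10.0, gap=5.0)
```

A micro-simulator evaluates such an update for every vehicle at every time step; frameworks like Flow then replace or augment this controller for the automated vehicles in the mix.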

A review on intersection management is given in (Chen and Englund, 2016). The paper discusses control strategies in signalized and non-signalized intersections. Four types of strategies have been investigated. Cooperative resource reservation concerns how vehicles reserve the tiles on their planned route for certain time slots to pass the intersection. Whereas resource allocation considers time slots and space tiles in, for example, intersections and roundabouts, trajectory planning concerns the scheduling of travel routes. Another strategy is to use virtual traffic lights to control traffic. The final approach is collision avoidance, which complements the above-mentioned ones. The resource planning or scheduling tools may have one plan, but the vehicle may have unknown constraints or deficiencies; therefore, collision avoidance with input control adjustments can be applied to ensure perpetual safety, i.e. collision avoidance in both the short and long term.
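The cooperative resource reservation idea can be sketched as a toy all-or-nothing booking of (tile, time slot) cells; real schemes add priorities, negotiation and rollback, all omitted here for brevity.

```python
class IntersectionManager:
    """Toy cooperative resource reservation: the intersection is a grid of
    tiles, and a vehicle must reserve every (tile, time slot) pair on its
    planned path before it is allowed to enter."""

    def __init__(self):
        self.reservations = {}          # (tile, slot) -> vehicle id

    def request(self, vehicle, path):
        """path: iterable of (tile, slot) pairs. All-or-nothing grant."""
        cells = list(path)
        if any(c in self.reservations for c in cells):
            return False                # conflict: request denied
        for c in cells:
            self.reservations[c] = vehicle
        return True

mgr = IntersectionManager()
granted_a = mgr.request("car_a", [((0, 0), 1), ((0, 1), 2)])
granted_b = mgr.request("car_b", [((0, 1), 2), ((1, 1), 3)])  # clashes in slot 2
granted_c = mgr.request("car_c", [((0, 1), 3), ((1, 1), 4)])  # later slots: free
```

Collision avoidance, as the complementary layer described above, would still monitor the executed trajectories in case a vehicle cannot keep to its reserved cells.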

Graph Neural Networks (GNNs) have shown great potential to use existing traffic data to model future transportation systems and enable counterfactual reasoning about factors that affect them. DeepMind, in a collaboration with Google Maps (Lange and Perez, 2020), has shown that the prediction of traffic and estimated arrival time improves once the problem is formulated using GNNs. The graph represents the road structure, and the artificial neural network learns the dynamics between the roads that build up the traffic system. The scalability of their approach enables modelling a complex traffic structure with a single model. Although GNNs have been around for several years, only now have they reached the maturity needed to solve realistically complex problems, thanks to both algorithmic progress and new GPU-optimized implementations.

Forecasting any given parameter in the complex dynamics of traffic can be considered a spatio-temporal problem. While spatial relations between roads and road sections can be modelled with a graph structure, the ways to model the temporal aspect vary. Xie et al. (Xie et al., 2020) propose SeqGNN, which combines sequence-to-sequence (Seq2Seq) models with GNNs. Song et al. (Song et al., 2018), on the other hand, model the temporal dependency between the graphs with a recurrent approach, and Guo et al. (Guo et al., 2019) add an attention mechanism to control which weights need to be updated. Although RNN-based approaches seem to be the more popular choice for modelling the temporal aspects, Yu et al. (Yu et al., 2018) propose a structure with several spatio-temporal convolutional blocks, where convolution is defined along the time axis to model the temporal dependencies. Their goal is to exploit a simpler structure with fast training capabilities that can handle multi-scale traffic networks. Although a lot can be gained by modelling the roads and road connections as a graph, there is no guarantee that such a graph structure models the true underlying relationships between the time series. There might be a need for graph embeddings (Chen et al., 2018), proximity embeddings (Wang et al., 2016), or walk embeddings (Grover and Leskovec, 2016), which can lower the dimensional complexity of the problem. A representation that is optimized to preserve both the proximity of nodes and the structure of the traversals is intuitively very appealing. In addition, the underlying graph might vary under different circumstances over time; thus, Löwe et al. propose amortized graphs (Lowe et al., 2020) that take advantage of the variation in the data.
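A single graph-convolution layer of the kind underlying such traffic GNNs can be written in a few lines of NumPy. The road graph, per-segment features and weights below are invented for illustration; the normalization follows the common symmetric form with self-loops.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer, H = ReLU(D^-1/2 (A + I) D^-1/2 X W),
    where nodes are road segments and X holds per-segment features
    (e.g. recent speeds and occupancies)."""
    A_hat = A + np.eye(A.shape[0])                  # add self-loops
    d = A_hat.sum(axis=1)                           # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W)

# Four road segments in a line: 0-1-2-3 (adjacency of the road graph).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.array([[55.0, 0.8],      # illustrative speed (km/h) and occupancy
              [40.0, 0.9],
              [20.0, 1.0],
              [60.0, 0.5]])
W = np.random.default_rng(5).normal(size=(2, 4))    # 2 -> 4 hidden features
H = gcn_layer(A, X, W)
```

Stacking such layers propagates information along the road network, which is how congestion on one segment can influence the forecast for its neighbors; the temporal models discussed above (Seq2Seq, recurrence, temporal convolution) are then applied on top of these spatial representations.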

3.3 Driver monitoring

With the introduction of more intelligent infrastructure and vehicles, one might conclude that the role that humans play will become less significant. However, some people might still want to drive themselves, which means that there will be a complex mixture of vehicles operating at different levels of autonomy (e.g. in terms of the levels described by SAE International (SAE, 2018)). Some potential dangers include "risk compensation", where people engage in riskier behaviors because they think technology can deal with it, as well as lower driving ability due to fewer opportunities to drive. Support can target accident-prone times, such as when a vehicle transitions to a more manual mode. Moreover, when control is removed from a human driver, there is a responsibility to ensure that humans in a vehicle are comfortable and safe. Thus, an important task is detecting the state of people within a vehicle, to avoid negative states and seek to achieve positive states. Negative states can involve sleepiness, distraction, drunkenness, health problems (e.g. epilepsy), and negative affect (anger, fear, and embarrassment), as well as individual predilections (some drivers may prefer to drive more wildly or be unsure how to interpret some driving situations due to lack of experience) (Yang et al., 2020). Positive states can involve comfort and enjoyment (Beggiato et al., 2020). To infer a human's state, typical features include eye analysis (e.g. eye aspect ratio and blinking), as well as gaze, head pose, and posture.
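One of the typical eye features mentioned above, the eye aspect ratio (EAR), is computed from six eye landmarks. The landmark coordinates below are made up to mimic an open and a nearly closed eye; they do not come from a real detector.

```python
import numpy as np

def eye_aspect_ratio(p):
    """EAR over six eye landmarks p1..p6 (rows of a (6, 2) array):
    EAR = (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|).
    The value drops towards zero when the eye closes, so a low EAR
    sustained over consecutive frames is a common blink/drowsiness cue."""
    v1 = np.linalg.norm(p[1] - p[5])    # first vertical distance
    v2 = np.linalg.norm(p[2] - p[4])    # second vertical distance
    h = np.linalg.norm(p[0] - p[3])     # horizontal eye width
    return (v1 + v2) / (2.0 * h)

# Illustrative landmark coordinates (pixels), not from a real detector.
open_eye = np.array([[0, 0], [3, 3], [7, 3], [10, 0], [7, -3], [3, -3]], float)
closed_eye = np.array([[0, 0], [3, 0.5], [7, 0.5], [10, 0],
                       [7, -0.5], [3, -0.5]], float)

ear_open = eye_aspect_ratio(open_eye)      # relatively large
ear_closed = eye_aspect_ratio(closed_eye)  # close to zero
```

In practice the landmarks come from a face-landmark detector, and the EAR is thresholded and smoothed over frames rather than judged on a single snapshot.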

In our previous work, we have explored the idea of using recurrent neural networks (Torstensson et al., 2019) to estimate the future behavior of a car driver. A video data set was collected describing typical (future) driver behaviors/activities, e.g. driving safely, glancing, leaning, removing a hand from the wheel, reaching, grabbing, retracting, and holding. A classification network was trained to recognize the current activity, and the result is fed into a recurrent network to predict the activity in the next frame. The accuracy of predicting the activity in the next frame is 80%; moreover, the method is capable of predicting activity up to 20 frames ahead with an accuracy of 62% (video was captured at 30 fps). Furthermore, we have explored how social media could be used to support drivers, by leveraging insight into how they feel outside of the vehicle and interacting to reduce loneliness, a construct that has been tied to factors that increase the risk of accidents, such as depression and sleep deprivation (Valle et al., 2021). Some open research challenges include how 'wisdom of the crowd' strategies can be incorporated to find potential dangers and anomalous driving, as well as how to infer the 'meaning' of detected emotions by detecting what they refer to, i.e. the 'emotional referent'. For example, if a passenger frowns, is this behavior a reaction to driving conditions, or to something on their cell phone?

The need for continuous user authentication and monitoring also becomes increasingly evident as larger fleets of professional vehicles are on the road and many drivers are pressured to drive longer than legislation allows. Continuous monitoring can, for example, help verify that an authorized person is driving a vehicle, or detect driver drowsiness or distraction. Here, modalities captured with cameras (face (Guo and Zhang, 2019; Li and Deng, 2020) or eye regions (Alonso-Fernandez et al., 2018; Alonso-Fernandez and Bigun, 2016)) can be complemented with sensors attached to the seat or the steering wheel that capture bio-signals such as heartbeats (Wartzek et al., 2011) or skin impedance (Macias et al., 2013), which correlates with sweating (and thus stress level) but also with fitness level (Jaffrin and Morel, 2008). There are also proofs of concept using Doppler radar for vital signs measurement (Li et al., 2013), which has the evident advantage of not needing any type of contact. Infrared thermal imaging, which captures subcutaneous blood flow and perspiration patterns (Ioannou et al., 2014), is another possibility, and it mitigates the privacy concerns that may arise from the use of regular cameras operating in the visible range. These solutions allow unobtrusive monitoring of human vital signs that goes beyond driver monitoring as well. While fatigue or distraction detection may seem the most straightforward task, other examples include: (i) pre-crash road safety, since abnormal vital signs can reflect the presence of drugs, alcohol, stress, or even diseases such as pre-dementia (Nicolini et al., 2014); (ii) post-crash road safety, because detected signals can be used by Advanced Automatic Crash Notification (AACN) systems to improve alarm handling; or (iii) person identification, which can be achieved in an unobtrusive way not only with traditional facial images, but also with bio-signals (Maiorana and Campisi, 2018).

Investments in driverless cars are already massive, in both the public and private sectors. However, their benefits will be several orders of magnitude higher if they are used collectively (taxis, buses, trucks feeding trains, etc.), reducing the number of vehicles serving transportation needs, producing higher time gains and reducing the environmental footprint. One bottleneck will then be to ensure that people who do not know each other can travel together with confidence and safety when there is no driver, and even without tickets or identity cards, since identity and rights management can be handled entirely by the system. In continuous biometrics, users are constantly monitored without needing active cooperation, in contrast to one-time authentication, e.g. at the beginning of a session. This may be done by using all pieces of biometric information available at a particular time, including soft biometrics, behavior, or emotional state (Jain et al., 2016). Evidence accumulated over time can also be used to improve accuracy. In this context, active modalities, such as fingerprints and iris, are often stronger than the weaker passive modalities (e.g. face), but the latter demand no cooperation. Additionally, certain intentions, expressions and physical states, e.g. drowsiness, stress, or irregular behavior, are noticeable only in the continuous visual signals of face and body. In these scenarios, localization of body parts and handheld objects is also important for safety and comfort, in order to detect potentially dangerous people or events. For example, holding a book, a steering wheel or a weapon makes a big difference, as does whom the hands belong to or what the hands are engaged in.
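A minimal sketch of accumulating evidence in continuous biometrics: per-modality match scores are fused with a weighted sum and smoothed over time, so a decision rests on accumulated rather than instantaneous evidence. The modalities, weights, smoothing factor and threshold are all illustrative assumptions, not values from the cited work.

```python
def update_confidence(prev, scores, weights, alpha=0.3):
    """Fuse per-modality match scores (each in [0, 1]) with a weighted sum,
    then smooth over time with an exponential moving average so a single
    noisy frame does not lock the user out. alpha is the update rate."""
    fused = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    return (1 - alpha) * prev + alpha * fused

weights = (0.5, 0.3, 0.2)       # face, periocular, heartbeat (assumed mix)
conf = 0.5                      # neutral prior at session start
for frame_scores in [(0.9, 0.8, 0.7), (0.95, 0.85, 0.6), (0.9, 0.9, 0.8)]:
    conf = update_confidence(conf, frame_scores, weights)
authenticated = conf > 0.6     # stay authenticated while evidence accumulates
```

The same structure accommodates the asymmetry discussed above: strong active modalities can be given large weights when available, while weak passive modalities contribute continuously in the background.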

In addition, for a vehicle with automated functions, knowing the driver in order to provide a pleasant experience is as important as knowing the actions and intentions of the surrounding road users. In (Varytimidis et al., 2018) we developed AI-based algorithms that could detect the actions and intentions of pedestrians. Such information is valuable when building reliable ADAS functions.

4 Open research challenges and standardization

The previous sections have outlined current research initiatives and state of the art within perception, traffic system control, and driver monitoring. We now explore viable future research directions based on the previous work. Perception from an ego vehicle is typically limited by the field of view of the available sensors, which calls for reliable communication between road users, and possibly also infrastructure, to improve local awareness and thereby allow for improved tactical and operational vehicle control. Standards such as Cooperative ITS (Chen and Englund, 2014) are put forward to ease collaboration in the traffic system. However, the current standard promotes simple "here I am" messages and does not allow negotiation between road users to improve traffic safety and traffic flow. Future research should consider data formats and ontologies to enable interaction not only between vehicles and between vehicles and infrastructure, but also between vehicles and VRUs. In addition, ontologies can be used to harmonize traffic behavior as described in (Englund et al., 2013). Another challenge is the ownership of the intelligent infrastructure and the shared data, and how to manage revenue. Consequently, to utilize the full potential of the technology, IoT and ICT, sharing of aggregated information such as trajectories or behavior should be enabled to further build local awareness and thus facilitate tactical decision-making in traffic.

Another area of future research is the security of AI-based systems. Research should address the risks associated with sensor vulnerabilities. In e.g. (Cuthbertsson, 2020) the author highlights the risk of hijacking a vehicle with the help of manipulated billboards. Risks of hacking (Greenberg, 2015) are also prevalent and could be remedied by time-critical AI-based anomaly detection methods. Such methods would ensure safe and secure tactical and operational road vehicle automation.

As sensor technologies develop and computing power increases, the use of autonomous drones, both aerial and wheeled, will increase. With full anti-collision capabilities they will ease our lives with instant delivery, guiding, carrying and surveillance. Such AI technology would need capabilities in all three domains, strategic, tactical and operational, to be efficient.

Another field that will evolve is how we interact with the technology. As mentioned in the introduction, there have been attempts with light-based external HMI to ease the introduction of platoons in traffic, to let the surrounding traffic understand the intentions of the platooning vehicles. With higher levels of automation, where the vehicles completely handle tactical and operational tasks, new ways of interacting with the vehicles are necessary. AI will play a major role in enabling interaction with automated vehicles through gestures and speech. The vehicles can observe the current state of the driver or operator and thereby enable natural interaction and control at all three levels of automation: strategic, tactical and operational.

With improved sensing and interaction, the introduction of unmanned vehicles will expand; both ground and aerial vehicles will become more and more common (Ortiz et al., 2018).

Pervasive intelligence is another concept that can have great impact on both business and society. As systems become more and more digitized and start interacting more with other systems (trucks in a platoon, different administrations within a municipality, or vehicles in a multi-modal transportation service system, for example), there is always a risk of sub-optimization, since the learning functions do not have access to the whole set of data. Consequently, future research should focus on how to optimize behavior on a larger scale, allowing the system to access data also outside of its own system boundaries.

One challenge of applying machine learning in vehicles is their rigorous safety requirements. The traditional standard ISO 26262 (ISO, 2018b) does not apply to trained machine learning-based software, since the behavior of such software is not explicitly expressed in source code and does not follow a certain specification. The developers rather define an algorithm and an architecture that learns the functionality. In the process, enormous amounts of data complemented with domain-specific labels are used to teach the machine to capture the relationships between input and output data. For the development of road vehicle automation, this step usually concerns the collection and preprocessing of huge amounts of data from e.g. camera, LiDAR, and radar sensors, along with training and evaluation using even more data.

In January 2019, ISO/PAS 21448 - Safety of the Intended Functionality (SOTIF) was published containing guidance on the applicable design, verification and validation measures that are needed to achieve the SOTIF (ISO, 2019). A PAS (Publicly Available Specification) is not an established standard, but rather a document that resembles the content of what is planned to be included in a future standard.

It is the intention that ISO 26262 and SOTIF should be complementary standards: ISO 26262 covers "absence of unreasonable risk due to hazards caused by malfunctioning behavior" (ISO, 2018b) by mandating rigorous development and is structured around the V-model way of working. The focus of SOTIF is to address "hazards resulting from functional insufficiencies of the intended functionality" (ISO, 2019), e.g. classification failures in an automotive local awareness system, which is different from the type of malfunctions targeted by the defect-oriented ISO 26262.

SOTIF is not structured according to the V-model but around (i) known safe states, (ii) known unsafe states, and (iii) unknown unsafe states. Note that SOTIF concerns the process of minimizing the two unsafe states by detailing the requirements specifications for the developed functionality, where the aim is to shift hazards from (iii) to (ii) and from (ii) to (i), through hazard identification and hazard mitigation, respectively.

Another challenge is harmonizing the introduction of more advanced vehicle functions and making them socially accepted. ADAS such as Forward vehicle collision mitigation systems (FVCMS), Pedestrian detection and collision mitigation systems (PDCMS) and Bicyclist detection and collision mitigation systems (BDCMS) are examples of vehicle functions that make use of on-board sensors to build local awareness around vehicles. These systems are described and specified by ISO, the International Organization for Standardization: FVCMS by ISO 22839:2013 (ISO, 2013), PDCMS by ISO 19237:2017 (ISO, 2017) and BDCMS by ISO 22078:2020 (ISO, 2020). These standards detail the concepts of the functions, their minimum functionality and system requirements, together with interfaces and how testing of the functions should be performed. Recently, standards on external Human-Machine Interfaces (eHMI) have been put forward. The functionality is described in the technical report Road Vehicles — Ergonomic aspects of external visual communication from automated vehicles to other road users, ISO/TR 23049 (ISO, 2018a). The document describes how automated vehicles should communicate their intentions to surrounding road users. To enable such functionality, the vehicles need to be able to interpret the behavior of their fellow road users.

5 Summary and Conclusion

SCC refers to a cohesive concept for developing a sustainable future society. This paper highlights how AI can support the development of SCC, in particular within future cooperative ITS. The two main challenges within ITS that are addressed by SCC are traffic safety and the environmental challenge. In Europe, halving the number of severely injured by 2030 compared to the 2020 level, and the ultimate goal of close to zero deaths by 2050, are two goals set by the European Commission (Commission, 2019b).


Perception using computer vision and sensing is one of the enablers of road vehicle automation. Sensor fusion is typically used to obtain a robust mapping of the surroundings. For the vehicle to understand and interpret the surroundings, it uses semantic mapping that makes the vehicle aware. Another way to interpret the surroundings that is highlighted is to use classification and tracking to predict the behavior of surrounding road users.

Traffic system control is central to managing traffic flow in larger cities. To improve traffic system control, traffic simulators are often used. Simulators can be used to plan cities or new road infrastructure, as well as to predict future scenarios or the effect of a prospective maintenance effort. Challenges include how to incorporate the effect of future automated vehicles, since their behavior is so far unknown.

Driver monitoring is a research field that improves the user experience in vehicle automation. In ADAS, the driver monitoring system helps to warn the driver if he/she becomes distracted or drowsy. The driver monitoring system also plays an important role in automated vehicles: the system should know the driver well enough to hand over control only when the driver is capable of driving. In-vehicle sensors may also be used for authentication. For this purpose, camera sensors can be fused with other modalities, e.g. in the steering wheel or the seat, that can capture pulse or skin impedance. Doppler radar sensors are another example of a system that, with the help of AI, can be used to estimate the vital parameters of a driver, e.g. to improve handover or warning applications.

As reported in this paper, AI-based technology has achieved many great things and promises huge benefits for economic growth, social development and quality of life in general, and for reducing environmental impact and improving traffic safety in particular.

Most of the AI-based technologies mentioned in this paper are data intensive. Not only are the volume and frequency of the data important; there is also a need to merge a variety of data of different types and from different sources. Data from service providers, municipalities, traffic authorities, and user data are required to create a holistic view of the situation at the scale of a city. Collecting and analyzing such data comes with privacy challenges, such as the trade-off between preserving the rights of the road users and providing personalized services. The possibilities of decentralized analysis of data or aggregation of intermediate results could also be considered.

The common challenge in real-world applications of AI is how to trust the system, since the behavior of an AI is not built from explicitly expressed source code and does not follow a certain specification, but is rather built by algorithms that learn the intended functionality from historical data. The generalizability of the algorithms in a real-world setting can only be measured when they are deployed in practice and forced to encounter corner cases that have not been encountered before.

However, one of the main challenges is how to set up the requirements of a system that is based on historical data. In road vehicle automation, where safety is imperative, verification and validation are crucial to benefit from the generalizability of AI technology. Even if the training data are annotated and contain labels for the objects in the scenes, what are the guarantees that no new objects, different from the ones in the training set, will appear in future traffic situations? In addition, what is the relationship between the requirements and the level of detail of the data annotation?

In addition, the low level of explainability of AI models, as well as data biases and data privacy, pose considerable risks for users, developers, humanity, and societies. Consequently, although AI can predict, model and sense, AI technology is not yet capable of explaining how it comes to certain conclusions, and therefore it is difficult for humans to fully trust it.

All in all, the enabling technologies for smart transport have come a long way in recent years. Now it is time to connect these technological building blocks to unlock the socio-economic benefits of smart cities. This step is going to be challenging, and it requires collaboration among several actors, from end-users, SMEs, Original Equipment Manufacturers (OEMs) and city officials to legislators, to implement existing technologies in practice.


  • EU- (2021a) (2021a). EU initiative CIVITAS. Accessed: 2021-04-21.
  • Cit (2021) (2021). EU project CITYkeys. Accessed: 2021-04-21.
  • EU- (2021b) (2021b). EU smart cities marketplace. Accessed: 2021-04-21.
  • EU2 (2021) (2021). European Union 2020. Accessed: 2021-04-21.
  • gdp (2021) (2021). GDP in the US between 1947-2009. Accessed: 2021-04-21.
  • EuC (2021) (2021). State of the union: Commission raises climate ambition. Accessed: 2021-04-21.
  • dri (2021) (2021). Strategic innovation program, drive sweden. Accessed: 2021-04-21.
  • inf (2021) (2021). Strategic innovation program, infra sweden. Accessed: 2021-04-21.
  • Dri (2021) (2021). Strategic innovation program newsletter, drive sweden. Accessed: 2021-04-21.
  • Sma (2021) (2021). Strategic innovation program, smart city sweden. Accessed: 2021-04-21.
  • Via (2021) (2021). Strategic innovation program, viable cities. Accessed: 2021-04-21.
  • Xpl (2021) (2021). Xplorion - residential mobility service in car-free accommodation. Accessed: 2021-04-21.
  • Agency (2019) Agency, E. E. (2019). Final energy consumption in Europe by mode of transport . Technical report, European Environment Agency.
  • Aksoy et al. (2019) Aksoy, E. E., S. Baci, and S. Cavdar (2019). Salsanet: Fast road and vehicle segmentation in lidar point clouds for autonomous driving. In 2020 IEEE Intelligent Vehicles Symposium (IV), pp. 926–932.
  • Alonso-Fernandez and Bigun (2016) Alonso-Fernandez, F. and J. Bigun (2016). A survey on periocular biometrics research. Pattern Recognition Letters 82, 92–105.
  • Alonso-Fernandez et al. (2018) Alonso-Fernandez, F., J. Bigun, and C. Englund (2018, Nov). Expression recognition using the periocular region: A feasibility study. In 2018 14th International Conference on Signal-Image Technology Internet-Based Systems (SITIS), pp. 536–541.
  • Alvear et al. (2018) Alvear, O., C. T. Calafate, J.-C. Cano, and P. Manzoni (2018). Crowdsensing in smart cities: Overview, platforms, and environment sensing issues. Sensors 18(2), 460.
  • Aramrattana et al. (2015) Aramrattana, M., T. Larsson, J. Jansson, and C. Englund (2015). Dimensions of Cooperative Driving, ITS and Automation. In Intelligent Vehicles Symposium (IV), Volume 2015-August, Seoul, pp. 144–149. IEEE.
  • Arasteh et al. (2016) Arasteh, H., V. Hosseinnezhad, V. Loia, A. Tommasetti, O. Troisi, M. Shafie-khah, and P. Siano (2016). Iot-based smart cities: a survey. In 2016 IEEE 16th International Conference on Environment and Electrical Engineering (EEEIC), pp. 1–6. IEEE.
  • Barrachina et al. (2014) Barrachina, J., P. Garrido, M. Fogue, F. J. Martinez, J.-C. Cano, C. T. Calafate, and P. Manzoni (2014). Reducing emergency services arrival time by using vehicular communications and evolution strategies. Expert Systems with Applications 41(4), 1206–1217.
  • Beggiato et al. (2020) Beggiato, M., N. Rauh, and J. Krems (2020). Facial expressions as indicator for discomfort in automated driving. In International Conference on Intelligent Human Systems Integration, pp. 932–937. Springer.
  • Belli et al. (2020) Belli, L., A. Cilfone, L. Davoli, G. Ferrari, P. Adorni, F. Di Nocera, A. Dall’Olio, C. Pellegrini, M. Mordacci, and E. Bertolotti (2020). Iot-enabled smart sustainable cities: Challenges and approaches. Smart Cities 3(3), 1039–1071.
  • Bishop (1995) Bishop, C. M. (1995). Neural Networks for Pattern Recognition. Oxford University Press, Inc.
  • Bokare and Maurya (2016) Bokare, P. S. and A. K. Maurya (2016, July). Acceleration-deceleration behaviour of various vehicle types. In World Conference on Transport Research, Shanghai, China.
  • Breiman (2001) Breiman, L. (2001, oct). Random Forests. Machine Learning 45(1), 5–32.
  • Byttner et al. (2011) Byttner, S., T. Rögnvaldsson, and M. Svensson (2011). Consensus self-organized models for fault detection (cosmo). Engineering applications of artificial intelligence 24(5), 833–839.
  • Chen et al. (2018) Chen, H., B. Perozzi, Y. Hu, and S. Skiena (2018). Harp: Hierarchical representation learning for networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Volume 32.
  • Chen and Englund (2014) Chen, L. and C. Englund (2014). Cooperative ITS - EU standards to accelerate cooperative mobility. In The 3rd International Conference on Connected Vehicles & Expo (ICCVE 2014), pp. 681–686.
  • Chen and Englund (2016) Chen, L. and C. Englund (2016). Cooperative Intersection Management: A Survey. IEEE Transactions on Intelligent Transportation Systems 17(2), 570–586.
  • Chen et al. (2018) Chen, L., G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille (2018). DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence 40(4), 834–848.
  • Chen et al. (2018) Chen, L.-C., Y. Zhu, G. Papandreou, F. Schroff, and H. Adam (2018, September). Encoder-decoder with atrous separable convolution for semantic image segmentation. In The European Conference on Computer Vision (ECCV).
  • Chen et al. (2017) Chen, X., H. Ma, J. Wan, B. Li, and T. Xia (2017). Multi-view 3d object detection network for autonomous driving. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp. 1907–1915.
  • Christopher (2009) Christopher, T. (2009). Analysis of Dynamic Scenes: Application to Driving Assistance. Ph. D. thesis, Institut Polytechnique de Grenoble, France.
  • Commission (2014) Commission, E. (2014). 2030 climate and energy goals for a competitive, secure and low-carbon EU economy.
  • Commission (2018) Commission, E. (2018). Annual Accident Report 2018. Technical report, European Commission, Directorate General for Transport.
  • Commission (2019a) Commission, E. (2019a). Data table – number of road deaths and rate per million population by country, 2010-2019. Technical report, CARE (Community Road Accident) database.
  • Commission (2019b) Commission, E. (2019b). EU Road Safety Policy Framework 2021-2030 – Next steps towards “Vision Zero”.
  • Cortinhal et al. (2020) Cortinhal, T., G. E. Tzelepis, and E. E. Aksoy (2020). SalsaNext: Fast, uncertainty-aware semantic segmentation of lidar point clouds. In International Symposium on Visual Computing, pp. 207–222.
  • Cuthbertson (2020) Cuthbertson, A. (2020). Hacked billboards could trick self-driving cars into suddenly stopping. Independent, Thursday 15 October.
  • Department (2020) Department, S. R. (2020). Distribution of gross domestic product (GDP) across economic sectors in the United States from 2000 to 2017. Statista.
  • Kim et al. (2018) Kim, D.-K., D. Maturana, M. Uenoyama, and S. Scherer (2018). Season-invariant semantic segmentation with a deep multimodal network. In Field and Service Robotics.
  • Englund (2020a) Englund, C. (2020a). Action Intention Recognition of Cars and Bicycles in Intersections. (In Press) Int. J. Vehicle Design, Special Issue on: Safety and Standards for Connected and Autonomous Vehicles.
  • Englund (2020b) Englund, C. (2020b). Aware and Intelligent Infrastructure for Action Intention Recognition of Cars and Bicycles. In 6th International Conference on Vehicle Technology and Intelligent Transport Systems (VEHITS), Prague, Czech Republic.
  • Englund et al. (2016) Englund, C., L. Chen, J. Ploeg, E. Semsar-Kazerooni, A. Voronov, H. H. Bengtsson, and J. Didoff (2016). The grand cooperative driving challenge 2016: boosting the introduction of cooperative automated vehicles. IEEE Wireless Communications 23(4), 146–152.
  • Englund et al. (2014a) Englund, C., L. Chen, and A. Voronov (2014a). Cooperative speed harmonization for efficient road utilization. In 2014 7th International Workshop on Communication Technologies for Vehicles (Nets4Cars-Fall), pp. 19–23. IEEE.
  • Englund et al. (2014b) Englund, C., L. Chen, and A. Voronov (2014b). Cooperative speed harmonization for efficient road utilization. In A. Vinel (Ed.), Nets4Cars, pp. 19–23.
  • Englund et al. (2013) Englund, C., K. Lidström, and J. Nilsson (2013, Sep). On the need for standardized representations of cooperative vehicle behavior. In Second International Symposium on Future Active Safety Technology toward zero-traffic-accident, Nagoya, Japan, pp. 1–6. Society of Automotive Engineers of Japan, Inc.
  • Englund and Verikas (2008) Englund, C. and A. Verikas (2008). Ink feed control in a web-fed offset printing press. The International Journal of Advanced Manufacturing Technology 39(9-10), 919–930.
  • Eurostat (2019) Eurostat (2019). Shedding light on energy in the EU – A guided tour of energy statistics. Technical report, Eurostat.
  • Eurostat (2020) Eurostat (2020). Shedding light on energy in the EU – A guided tour of energy statistics. Technical report, Eurostat.
  • Feng et al. (2019) Feng, D., C. Haase-Schuetz, L. Rosenbaum, H. Hertlein, F. Duffhauss, C. Gläser, W. Wiesbeck, and K. Dietmayer (2019). Deep multi-modal object detection and semantic segmentation for autonomous driving: Datasets, methods, and challenges. CoRR abs/1902.07830.
  • Feng et al. (2018) Feng, D., L. Rosenbaum, and K. Dietmayer (2018). Towards safe autonomous driving: Capture uncertainty in the deep neural network for lidar 3d vehicle detection. In 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pp. 3266–3273. IEEE.
  • Freund et al. (1999) Freund, Y., R. Schapire, and N. Abe (1999). A short introduction to boosting. Journal of Japanese Society for Artificial Intelligence 14(5), 771–780.
  • Gal and Ghahramani (2016) Gal, Y. and Z. Ghahramani (2016). Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In international conference on machine learning, pp. 1050–1059.
  • Garcia et al. (2017) Garcia, F., D. Martin, A. De La Escalera, and J. M. Armingol (2017). Sensor fusion methodology for vehicle detection. IEEE Intelligent Transportation Systems Magazine 9(1), 123–133.
  • Gärling and Schuitema (2007) Gärling, T. and G. Schuitema (2007). Travel demand management targeting reduced private car use: effectiveness, public acceptability and political feasibility. Journal of Social Issues 63(1), 139–153.
  • Gers et al. (2000) Gers, F. A., J. Schmidhuber, and F. Cummins (2000). Learning to forget: Continual prediction with LSTM. Neural Computation 12(10), 2451–2471.
  • Girshick et al. (2014) Girshick, R., J. Donahue, T. Darrell, and J. Malik (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587.
  • Girshick et al. (2015) Girshick, R., F. Iandola, T. Darrell, and J. Malik (2015). Deformable part models are convolutional neural networks. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp. 437–446.
  • Greenberg (2015) Greenberg, A. (2015). Hackers remotely kill a jeep on the highway—with me in it. WIRED, 07.21.2015.
  • Grover and Leskovec (2016) Grover, A. and J. Leskovec (2016). Node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, New York, NY, USA, pp. 855–864. Association for Computing Machinery.
  • Guo and Zhang (2019) Guo, G. and N. Zhang (2019). A survey on deep learning based face recognition. Computer Vision and Image Understanding 189, 102805.
  • Guo et al. (2019) Guo, Y., H. Wang, Q. Hu, H. Liu, L. Liu, and M. Bennamoun (2019). Deep learning for 3d point clouds: A survey. CoRR.
  • Guo et al. (2019) Guo, Z., Y. Zhang, and W. Lu (2019, July). Attention guided graph convolutional networks for relation extraction. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, pp. 241–251. Association for Computational Linguistics.
  • Hazirbas et al. (2016) Hazirbas, C., L. Ma, C. Domokos, and D. Cremers (2016). FuseNet: Incorporating depth into semantic segmentation via fusion-based CNN architecture. In ACCV.
  • He et al. (2017) He, K., G. Gkioxari, P. Dollár, and R. Girshick (2017). Mask R-CNN. In Proceedings of the IEEE international conference on computer vision, pp. 2961–2969.
  • Ilg et al. (2018) Ilg, E., O. Cicek, S. Galesso, A. Klein, O. Makansi, F. Hutter, and T. Brox (2018). Uncertainty estimates and multi-hypotheses networks for optical flow. In ECCV, pp. 652–667.
  • Ioannou et al. (2014) Ioannou, S., V. Gallese, and A. Merla (2014). Thermal infrared imaging in psychophysiology: potentialities and limits. Psychophysiology 51.
  • ISO (2013) ISO (2013, Jun). Intelligent transport systems — Forward vehicle collision mitigation systems — Operation, performance, and verification requirements. Available in electronic form for online purchase.
  • ISO (2017) ISO (2017, Dec). Intelligent transport systems — Pedestrian detection and collision mitigation systems (PDCMS) — Performance requirements and test procedures. Available in electronic form for online purchase.
  • ISO (2018a) ISO (2018a, Sep). Road Vehicles — Ergonomic aspects of external visual communication from automated vehicles to other road users. Available in electronic form for online purchase.
  • ISO (2018b) ISO (2018b, Dec). Road vehicles — Functional safety — Part 1: Vocabulary. Available in electronic form for online purchase.
  • ISO (2019) ISO (2019, Jan). Road vehicles — Safety of the intended functionality. Available in electronic form for online purchase.
  • ISO (2020) ISO (2020, Feb). Intelligent transport systems — Bicyclist detection and collision mitigation systems (BDCMS) — Performance requirements and test procedures. Available in electronic form for online purchase.
  • Jaffrin and Morel (2008) Jaffrin, M. Y. and H. Morel (2008). Body fluid volumes measurements by impedance: A review of bioimpedance spectroscopy (BIS) and bioimpedance analysis (BIA) methods. Medical Engineering & Physics 30.
  • Jain et al. (2016) Jain, A., K. Nandakumar, and A. Ross (2016, Aug). 50 years of biometric research: Accomplishments, challenges, and opportunities. Pattern Recognition Letters 79, 80–105.
  • Jazayeri et al. (2011) Jazayeri, A., H. Cai, J. Y. Zheng, and M. Tuceryan (2011). Vehicle detection and tracking in car video based on motion model. IEEE Transactions on Intelligent Transportation Systems 12(2), 583–595.
  • Joseph et al. (2011) Joseph, J., F. Doshi-Velez, A. S. Huang, and N. Roy (2011, Aug). A bayesian nonparametric approach to modeling motion patterns. Autonomous Robots 31(4), 383.
  • Kendall et al. (2015) Kendall, A., V. Badrinarayanan, and R. Cipolla (2015). Bayesian SegNet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding. arXiv preprint arXiv:1511.02680.
  • Kheterpal et al. (2018) Kheterpal, N., K. Parvate, C. Wu, A. Kreidieh, E. Vinitsky, and A. Bayen (2018). Flow: Deep reinforcement learning for control in SUMO. EPiC Series in Engineering 2, 134–151.
  • Kianfar et al. (2012) Kianfar, R., B. Augusto, A. Ebadighajari, U. Hakeem, J. Nilsson, A. Raza, R. S. Tabar, N. V. Irukulapati, C. Englund, P. Falcone, et al. (2012). Design and experimental validation of a cooperative driving system in the grand cooperative driving challenge. IEEE Transactions on Intelligent Transportation Systems 13(3), 994–1007.
  • Kocić et al. (2018) Kocić, J., N. Jovičić, and V. Drndarević (2018). Sensors and sensor fusion in autonomous vehicles. In 2018 26th Telecommunications Forum (TELFOR), pp. 420–425. IEEE.
  • Krajzewicz et al. (2012) Krajzewicz, D., J. Erdmann, M. Behrisch, and L. Bieker (2012). Recent development and applications of SUMO – Simulation of Urban Mobility. International Journal On Advances in Systems and Measurements 5(3&4).
  • Krajzewicz et al. (2002) Krajzewicz, D., G. Hertkorn, C. Feld, and P. Wagner (2002, 01). SUMO (Simulation of Urban Mobility): an open-source traffic simulation. In 4th Middle East Symposium on Simulation and Modelling (MESM2002), pp. 183–187.
  • Kucner et al. (2013) Kucner, T., J. Saarinen, M. Magnusson, and A. J. Lilienthal (2013, November). Conditional transition maps: learning motion patterns in dynamic environments. In IEEE International Conference on Intelligent Robots and Systems, Tokyo, Japan.
  • Landrieu and Simonovsky (2018) Landrieu, L. and M. Simonovsky (2018). Large-scale point cloud semantic segmentation with superpoint graphs. In CVPR.
  • Lange and Perez (2020) Lange, O. and L. Perez (2020). Traffic prediction with advanced graph neural networks.
  • Law and Deng (2018) Law, H. and J. Deng (2018). CornerNet: Detecting objects as paired keypoints. In ECCV.
  • Lefevre et al. (2013) Lefevre, S., C. Laugier, and J. Ibanez-Guzman (2013). Intention-aware risk estimation for general traffic situations, and application to intersection safety. Research report, Inria.
  • Lefevre et al. (2014) Lefevre, S., D. Vasquez, and C. Laugier (2014). A survey on motion prediction and risk assessment for intelligent vehicles. ROBOMECH Journal 1(1).
  • Li et al. (2013) Li, C., V. M. Lubecke, O. Boric-Lubecke, and J. Lin (2013). A review on recent advances in doppler radar sensors for noncontact healthcare monitoring. IEEE Transactions on Microwave Theory and Techniques 61(5), 2046–2060.
  • Li et al. (2013) Li, Q., L. Chen, M. Li, S.-L. Shaw, and A. Nüchter (2013). A sensor-fusion drivable-region and lane-detection system for autonomous vehicle navigation in challenging road scenarios. IEEE Transactions on Vehicular Technology 63(2), 540–555.
  • Li and Deng (2020) Li, S. and W. Deng (2020). Deep facial expression recognition: A survey. IEEE Transactions on Affective Computing, 1–1.
  • Li et al. (2018) Li, S., W. Wang, Z. Mo, and D. Zhao (2018, June). Cluster naturalistic driving encounters using deep unsupervised learning. arXiv:1802.10214.
  • Liang* et al. (2019) Liang*, M., B. Yang*, Y. Chen, R. Hu, and R. Urtasun (2019). Multi-task multi-sensor fusion for 3d object detection. In CVPR.
  • Lidström and Larsson (2009) Lidström, K. and T. Larsson (2009). Act normal: using uncertainty about driver intentions as a warning criterion. In 16th World Congress on Intelligent Transportation Systems (ITS WC), 21-25 September, 2009, Stockholm, Sweden, pp. 8.
  • Lidström et al. (2012) Lidström, K., K. Sjöberg, U. Holmberg, J. Andersson, F. Bergh, M. Bjäde, and S. Mak (2012). A modular CACC system integration and design. IEEE Transactions on Intelligent Transportation Systems 13(3), 1050–1061.
  • Liu et al. (2020) Liu, L., W. Ouyang, X. Wang, P. W. Fieguth, J. Chen, X. Liu, and M. Pietikäinen (2020). Deep learning for generic object detection: A survey. International Journal of Computer Vision 128.
  • Liu et al. (2016) Liu, W., D. Anguelov, D. Erhan, C. Szegedy, S. E. Reed, C. Fu, and A. C. Berg (2016). SSD: single shot multibox detector. In ECCV.
  • Lowe et al. (2020) Lowe, S., D. Madras, R. Zemel, and M. Welling (2020). Amortized causal discovery: Learning to infer causal graphs from time-series data.
  • Macias et al. (2013) Macias, R., M. A. García, J. Ramos, R. Bragos, and M. Fernández (2013). Ventilation and heart rate monitoring in drivers using a contactless electrical bioimpedance system. Journal of Physics: Conference Series 434.
  • Magavi (2020) Magavi, S. A. (2020). Behaviour modelling of vehicles at a roundabout. Master’s thesis, Halmstad University, Halmstad Embedded and Intelligent Systems Research (EIS).
  • Maiorana and Campisi (2018) Maiorana, E. and P. Campisi (2018). Longitudinal evaluation of eeg-based biometric recognition. IEEE Transactions on Information Forensics and Security 13(5), 1123–1138.
  • Maurya and Bokare (2012) Maurya, A. K. and P. S. Bokare (2012). Study of deceleration behaviour of different vehicle types. International Journal for Traffic and Transport Engineering 2(3), 253–270.
  • Mees et al. (2017) Mees, O., A. Eitel, and W. Burgard (2017). Choosing smartly: Adaptive multimodal fusion for object detection in changing environments. CoRR abs/1707.05733.
  • Milioto et al. (2019) Milioto, A., I. Vizzo, J. Behley, and C. Stachniss (2019). RangeNet++: Fast and Accurate LiDAR Semantic Segmentation. In IROS.
  • Muffert et al. (2012) Muffert, M., T. Milbich, D. Pfeiffer, and U. Franke (2012, June). May i enter the roundabout? a time-to-contact computation based on stereo-vision. In Intelligent Vehicles Symposium, Spain.
  • Muhammad and Åstrand (2018) Muhammad, N. and B. Åstrand (2018). Intention estimation using set of reference trajectories as behaviour model. Sensors 18(12), 4423.
  • Muhammad and Åstrand (2019) Muhammad, N. and B. Åstrand (2019). Predicting agent behaviour and state for applications in a roundabout-scenario autonomous driving. Sensors 19(19).
  • Nicolini et al. (2014) Nicolini, P., M. M. Ciulla, G. Malfatto, C. Abbate, D. Mari, P. D. Rossi, E. Pettenuzzo, F. Magrini, D. Consonni, and F. Lombardi (2014). Autonomic dysfunction in mild cognitive impairment: evidence from power spectral analysis of heart rate variability in a cross-sectional case-control study. PloS one 9.
  • Niska and Eriksson (2013) Niska, A. and J. Eriksson (2013). Statistik över cyklisters olyckor: faktaunderlag till gemensam strategi för säker cykling [Statistics on cyclists’ accidents: supporting data for a joint strategy for safe cycling]. VTI.
  • Organization (2019) Organization, W. H. (2019). Global Status Report on Road Safety. Technical report, World Health Organization.
  • Ortiz et al. (2018) Ortiz, S., C. T. Calafate, J.-C. Cano, P. Manzoni, and C. K. Toh (2018). A uav-based content delivery architecture for rural areas and future smart cities. IEEE Internet Computing 23(1), 29–36.
  • Poudel et al. (2019) Poudel, R. P. K., S. Liwicki, and R. Cipolla (2019). Fast-SCNN: Fast semantic segmentation network. CoRR abs/1902.04502.
  • Qi et al. (2017) Qi, C. R., W. Liu, C. Wu, H. Su, and L. J. Guibas (2017). Frustum PointNets for 3d object detection from RGB-D data. CoRR.
  • Qi et al. (2017) Qi, C. R., H. Su, K. Mo, and L. J. Guibas (2017). PointNet: Deep learning on point sets for 3d classification and segmentation. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR).
  • Qi et al. (2018) Qi, C. R., L. Yi, H. Su, and L. J. Guibas (2018). PointNet++: Deep hierarchical feature learning on point sets in a metric space. In Proc. of the Advances in Neural Information Processing Systems (NIPS).
  • Redmon et al. (2016) Redmon, J., S. K. Divvala, R. B. Girshick, and A. Farhadi (2016). You only look once: Unified, real-time object detection. In CVPR.
  • Ren et al. (2016) Ren, S., K. He, R. Girshick, and J. Sun (2016). Faster r-cnn: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 39(6), 1137–1149.
  • Rudenko et al. (2019) Rudenko, A., L. Palmieri, M. Herman, K. M. Kitani, D. M. Gavrila, and K. O. Arras (2019). Human motion trajectory prediction: A survey. arXiv:1905.06113 [cs.RO].
  • SAE (2018) SAE (2018). J3016 levels of driving automation. Technical report, SAE International.
  • Scuotto et al. (2016) Scuotto, V., A. Ferraris, S. Bresciani, M. Al-Mashari, and M. Del Giudice (2016). Internet of things: applications and challenges in smart cities. A case study of IBM smart city projects. Business Process Management Journal.
  • Sivaraman and Trivedi (2013) Sivaraman, S. and M. M. Trivedi (2013). Looking at vehicles on the road: A survey of vision-based vehicle detection, tracking, and behavior analysis. IEEE Transactions on Intelligent Transportation Systems 14(4), 1773–1795.
  • Song et al. (2018) Song, L., Y. Zhang, Z. Wang, and D. Gildea (2018). N-ary relation extraction using graph-state LSTM. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, pp. 2226–2235. Association for Computational Linguistics.
  • Szegedy et al. (2013) Szegedy, C., A. Toshev, and D. Erhan (2013). Deep neural networks for object detection. In Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, NIPS’13, Red Hook, NY, USA, pp. 2553–2561. Curran Associates Inc.
  • Takumi et al. (2017) Takumi, K., K. Watanabe, Q. Ha, A. Tejero-De-Pablos, Y. Ushiku, and T. Harada (2017). Multispectral object detection for autonomous vehicles. In Proceedings of the on Thematic Workshops of ACM Multimedia 2017, Thematic Workshops ’17, New York, NY, USA, pp. 35–43. Association for Computing Machinery.
  • Tavares de Araujo Cesariny Calafate et al. (2016) Tavares de Araujo Cesariny Calafate, C. M., C. Wu, E. Natalizio, and F. J. Martínez (2016). Crowdsensing and vehicle-based sensing. Mobile Information Systems 2016.
  • Torstensson et al. (2019) Torstensson, M., B. Duran, and C. Englund (2019). Using recurrent neural networks for action and intention recognition of car drivers. In 8th International Conference on Pattern Recognition Applications and Methods, pp. 232–242.
  • Tukker (2015) Tukker, A. (2015). Product services for a resource-efficient and circular economy–a review. Journal of cleaner production 97, 76–91.
  • Valada et al. (2019) Valada, A., R. Mohan, and W. Burgard (2019). Self-supervised model adaptation for multimodal semantic segmentations. International Journal of Computer Vision.
  • Valle et al. (2021) Valle, F., A. Galozy, A. Ashfaq, F. Etminani, A. Vinel, and M. Cooney (2021). Lonely road: speculative challenges for a social media robot aimed to reduce driver loneliness. In MAISON2021, pp. 6.
  • Vapnik (1998) Vapnik, V. N. (1998). Statistical Learning Theory (Adaptive and Learning Systems for Signal Processing, Communications and Control Series). Wiley-Interscience.
  • Varytimidis et al. (2018) Varytimidis, D., F. Alonso-Fernandez, B. Duran, and C. Englund (2018). Action and intention recognition of pedestrians in urban traffic. In 2018 14th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), pp. 676–682. IEEE.
  • Wang and Lien (2008) Wang, C.-C. R. and J.-J. J. Lien (2008). Automatic vehicle detection using local features—a statistical approach. IEEE Transactions on Intelligent Transportation Systems 9(1), 83–96.
  • Wang et al. (2016) Wang, D., P. Cui, and W. Zhu (2016). Structural deep network embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, New York, NY, USA, pp. 1225–1234. Association for Computing Machinery.
  • Wang and Jia (2019) Wang, Z. and K. Jia (2019). Frustum ConvNet: Sliding frustums to aggregate local point-wise features for amodal 3d object detection. CoRR abs/1903.01864.
  • Wartzek et al. (2011) Wartzek, T., B. Eilebrecht, J. Lem, H. Lindner, S. Leonhardt, and M. Walter (2011). ECG on the road: Robust and unobtrusive estimation of heart rate. IEEE Transactions on Biomedical Engineering 58(11), 3112–3120.
  • Welch and Bishop (1995) Welch, G. and G. Bishop (1995). An introduction to the Kalman filter. Technical Report TR 95-041, University of North Carolina at Chapel Hill.
  • World Health Organization (WHO) (2015) World Health Organization (WHO) (2015). Global Status Report on Road Safety 2015. Geneva: WHO Press.
  • WSDT (2019) WSDT (2019). Roundabout benefits – Washington State Department of Transportation. Online.
  • Wu et al. (2018) Wu, B., A. Wan, X. Yue, and K. Keutzer (2018). SqueezeSeg: Convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3d lidar point cloud. In ICRA.
  • Wu et al. (2019) Wu, B., X. Zhou, S. Zhao, X. Yue, and K. Keutzer (2019). SqueezeSegV2: Improved model structure and unsupervised domain adaptation for road-object segmentation from a lidar point cloud. In ICRA.
  • Wu et al. (2017) Wu, C., A. Kreidieh, K. Parvate, E. Vinitsky, and A. M. Bayen (2017). Flow: Architecture and benchmarking for reinforcement learning in traffic control. arXiv preprint arXiv:1710.05465.
  • Xie et al. (2020) Xie, Z., W. Lv, S. Huang, Z. Lu, B. Du, and R. Huang (2020). Sequential graph neural network for urban road traffic speed prediction. IEEE Access 8, 63349–63358.
  • Xu et al. (2018) Xu, D., D. Anguelov, and A. Jain (2018, June). PointFusion: Deep sensor fusion for 3d bounding box estimation. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  • Yang et al. (2020) Yang, D., X. Li, X. Dai, R. Zhang, L. Qi, W. Zhang, and Z. Jiang (2020). All in one network for driver attention monitoring. In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 1–6.
  • Yu et al. (2018) Yu, B., H. Yin, and Z. Zhu (2018, 7). Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, pp. 3634–3640. International Joint Conferences on Artificial Intelligence Organization.
  • Zhang et al. (2018) Zhang, C., W. Luo, and R. Urtasun (2018). Efficient convolutions for real-time semantic segmentation of 3d point clouds. In Proceedings of the International Conference on 3D Vision (3DV).
  • Zhao et al. (2017) Zhao, H., J. Shi, X. Qi, X. Wang, and J. Jia (2017, July). Pyramid scene parsing network. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  • Zhao et al. (2017) Zhao, M., D. Kathner, M. Jipp, D. Soffker, and K. Lemmer (2017, June). Modeling driver behavior at roundabouts: Results from a field study. In IEEE Intelligent Vehicles Symposium, Redondo Beach, CA, USA.
  • Zhao et al. (2017) Zhao, M., D. Kathner, D. Soffker, M. Jipp, and K. Lemmer (2017). Modeling driving behavior at roundabouts: Impact of roundabout layout and surrounding traffic on driving behavior. Workshop paper, available online.
  • Zhou and Tuzel (2018) Zhou, Y. and O. Tuzel (2018, June). Voxelnet: End-to-end learning for point cloud based 3d object detection. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4490–4499.