Impact of Event Encoding and Dissimilarity Measures on Traffic Crash Characterization Based on Sequence of Events

by Yu Song, et al.

Prior studies have shown crash sequence analysis to be useful for characterizing crashes and identifying safety countermeasures. Sequence analysis is highly domain-specific, but its various techniques have not been evaluated for adaptation to crash sequences. This paper evaluates the impact of event encoding schemes and dissimilarity measures on crash sequence analysis and clustering. The data studied were event sequences of single-vehicle crashes on interstate highways in the United States from 2016 to 2018. Two encoding schemes and five optimal-matching-based dissimilarity measures were compared by evaluating the sequence clustering results. The five dissimilarity measures fell into two groups based on correlations between their dissimilarity matrices. The best-performing dissimilarity measure and encoding scheme were identified based on agreement with a benchmark crash categorization. The transition-rate-based, localized optimal matching dissimilarity combined with the consolidated encoding scheme had the highest agreement with the benchmark. Evaluation results indicate that the selection of dissimilarity measure and encoding scheme determines the results of sequence clustering and crash characterization. A dissimilarity measure that considers the relationships between events and the domain context tends to perform well in crash sequence clustering. An encoding scheme that consolidates similar events naturally takes domain context into consideration.
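
To make the idea of a transition-rate-based optimal matching dissimilarity more concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes the common convention of setting the substitution cost between two events to 2 - p(a|b) - p(b|a), where p is the observed transition rate, and a constant insertion/deletion cost of 1. The "localized" variant named in the abstract is not reproduced here. The event codes, the example sequences, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product


def transition_rates(sequences):
    """Estimate p(b | a): the share of transitions leaving event a that go to b."""
    pair_counts = Counter()
    from_counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            from_counts[a] += 1
    states = sorted({event for seq in sequences for event in seq})
    return {
        (a, b): (pair_counts[(a, b)] / from_counts[a] if from_counts[a] else 0.0)
        for a, b in product(states, repeat=2)
    }


def substitution_cost(rates, a, b):
    """Assumed convention: events that often follow each other are cheap to swap."""
    if a == b:
        return 0.0
    return 2.0 - rates[(a, b)] - rates[(b, a)]


def om_distance(x, y, rates, indel=1.0):
    """Optimal matching distance: the minimum total cost of insertions,
    deletions, and substitutions that turn sequence x into sequence y."""
    n, m = len(x), len(y)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel
    for j in range(1, m + 1):
        d[0][j] = j * indel
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + indel,  # delete x[i-1]
                d[i][j - 1] + indel,  # insert y[j-1]
                d[i - 1][j - 1] + substitution_cost(rates, x[i - 1], y[j - 1]),
            )
    return d[n][m]


# Hypothetical encoded crash sequences (illustrative event codes only).
sequences = [
    ["RUN_OFF_ROAD", "HIT_GUARDRAIL", "ROLLOVER"],
    ["RUN_OFF_ROAD", "HIT_TREE"],
    ["RUN_OFF_ROAD", "HIT_GUARDRAIL"],
    ["LOSS_OF_CONTROL", "RUN_OFF_ROAD", "ROLLOVER"],
]
rates = transition_rates(sequences)

# Pairwise dissimilarity matrix, which a clustering method could take as input.
matrix = [[om_distance(x, y, rates) for y in sequences] for x in sequences]
```

A matrix like `matrix` is what a clustering step would consume. Changing the encoding (for example, merging `HIT_GUARDRAIL` and `HIT_TREE` into one consolidated code) changes both the transition rates and the resulting distances, which is the kind of sensitivity the abstract describes.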




