Spatial-Temporal Deep Embedding for Vehicle Trajectory Reconstruction from High-Angle Video
Spatial-temporal Map (STMap)-based methods have shown great potential for processing high-angle videos for vehicle trajectory reconstruction, which can meet the needs of various data-driven modeling and imitation learning applications. In this paper, we develop the Spatial-Temporal Deep Embedding (STDE) model, which imposes parity constraints at both the pixel and instance levels to generate instance-aware embeddings for vehicle stripe segmentation on STMap. At the pixel level, each pixel is encoded with its 8-neighbor pixels at different ranges, and this encoding is then used to guide a neural network to learn the embedding mechanism. At the instance level, a discriminative loss function is designed to pull pixels belonging to the same instance closer together and to push the mean embeddings of different instances far apart in the embedding space. The output of the spatial-temporal affinity is then optimized by the mutex-watershed algorithm to obtain the final clustering results. Based on segmentation metrics, our model outperforms five baselines that have been used for STMap processing and is robust to shadows, static noise, and overlapping. The model is applied to all public NGSIM US-101 videos to generate complete vehicle trajectories, indicating good scalability and adaptability. Finally, the strengths of the scanline method with STDE and future directions are discussed. Code, the STMap dataset, and video trajectories are made publicly available in the online repository. GitHub Link: shorturl.at/jklT0.
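To make the two levels of constraint concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a labeled STMap in which 0 marks background and positive integers mark vehicle stripes. The function names, the ranges `(1, 2)`, and the hinge margins `delta_v` and `delta_d` are hypothetical choices for illustration; the loss follows a commonly used pull/push discriminative formulation, and the exact form used in the paper may differ.

```python
import numpy as np

# 8-neighbor offsets (row, col) at unit range; each is scaled by a range r below.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                    (0, 1), (1, -1), (1, 0), (1, 1)]


def neighbor_affinity(labels, ranges=(1, 2)):
    """Pixel-level encoding (illustrative): channel k is 1 where a pixel and
    its neighbor at offset r * (dy, dx) share the same nonzero instance label.
    Returns an array of shape (8 * len(ranges), H, W)."""
    h, w = labels.shape
    out = np.zeros((8 * len(ranges), h, w), dtype=np.uint8)
    for ri, r in enumerate(ranges):
        for oi, (dy, dx) in enumerate(NEIGHBOR_OFFSETS):
            oy, ox = r * dy, r * dx
            # Pixels whose shifted neighbor still lies inside the image.
            y0, y1 = max(0, -oy), min(h, h - oy)
            x0, x1 = max(0, -ox), min(w, w - ox)
            if y0 >= y1 or x0 >= x1:
                continue  # offset larger than the image; channel stays zero
            src = labels[y0:y1, x0:x1]
            dst = labels[y0 + oy:y1 + oy, x0 + ox:x1 + ox]
            out[8 * ri + oi, y0:y1, x0:x1] = (src == dst) & (src > 0)
    return out


def discriminative_loss(emb, labels, delta_v=0.5, delta_d=1.5):
    """Instance-level loss (illustrative). emb has shape (D, H, W).
    Pull term: pixels farther than delta_v from their instance mean are penalized.
    Push term: instance means closer than 2 * delta_d are penalized."""
    ids = [i for i in np.unique(labels) if i != 0]
    if not ids:
        return 0.0
    means, pull = [], 0.0
    for i in ids:
        pix = emb[:, labels == i]                      # (D, N_i)
        mu = pix.mean(axis=1)
        dist = np.linalg.norm(pix - mu[:, None], axis=0)
        pull += np.mean(np.maximum(dist - delta_v, 0.0) ** 2)
        means.append(mu)
    pull /= len(ids)
    push, pairs = 0.0, 0
    for a in range(len(means)):
        for b in range(a + 1, len(means)):
            gap = np.linalg.norm(means[a] - means[b])
            push += max(2.0 * delta_d - gap, 0.0) ** 2
            pairs += 1
    if pairs:
        push /= pairs
    return float(pull + push)
```

In this sketch the affinity channels would serve as training targets for the network, while the loss is evaluated on the predicted embeddings. A differentiable framework would be needed for actual training, and the mutex-watershed clustering step is not shown.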