Hierarchical Latent Structure for Multi-Modal Vehicle Trajectory Forecasting

by Dooseop Choi, et al.

The variational autoencoder (VAE) has been widely used to model data distributions because it is theoretically elegant, easy to train, and has nice manifold representations. However, when applied to image reconstruction and synthesis tasks, VAEs tend to generate blurry samples. We observe that a similar problem, in which the generated trajectory lies between adjacent lanes, often arises in VAE-based trajectory forecasting models. To mitigate this problem, we introduce a hierarchical latent structure into the VAE-based forecasting model. Assuming that the trajectory distribution can be approximated as a mixture of simple distributions (or modes), we use the low-level latent variable to model each mode of the mixture and the high-level latent variable to represent the weights of the modes. To model each mode accurately, we condition the low-level latent variable on two lane-level context vectors computed in novel ways: one corresponds to vehicle-lane interaction and the other to vehicle-vehicle interaction. The context vectors are also used to model the weights via the proposed mode selection network. We evaluate our forecasting model on two large-scale real-world datasets. Experimental results show that our model not only generates clear multi-modal trajectory distributions but also outperforms state-of-the-art (SOTA) models in prediction accuracy. Our code is available at https://github.com/d1024choi/HLSTrajForecast.
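The mixture assumption in the abstract can be illustrated with a toy one-dimensional sketch. This is not the authors' model: it only contrasts two-level sampling (choose a mode from the weights, then sample within that mode) with a single Gaussian fitted to the same data. The lane positions, the spread, and every function and variable name here are assumptions made up for illustration.

```python
import math
import random


def softmax(scores):
    """Turn arbitrary mode scores into weights that sum to one."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hierarchical(lane_centers, mode_scores, sigma, rng):
    """Two-level sampling: pick a mode (high level), then sample inside it (low level)."""
    weights = softmax(mode_scores)
    mode = rng.choices(range(len(lane_centers)), weights=weights, k=1)[0]
    offset = rng.gauss(lane_centers[mode], sigma)
    return mode, offset


rng = random.Random(0)
lane_centers = [0.0, 3.5]  # hypothetical lateral positions of two lane centerlines (m)
mode_scores = [0.0, 0.0]   # hypothetical scores; equal scores give equal weights
sigma = 0.2                # hypothetical within-lane spread (m)
n = 1000

# Hierarchical samples: each one commits to a lane before adding noise.
hier = [sample_hierarchical(lane_centers, mode_scores, sigma, rng) for _ in range(n)]

# Single-Gaussian baseline with the same mean and variance as the mixture.
mean = sum(lane_centers) / len(lane_centers)
var = sigma ** 2 + sum((c - mean) ** 2 for c in lane_centers) / len(lane_centers)
flat = [rng.gauss(mean, math.sqrt(var)) for _ in range(n)]


def between_lanes(y):
    """True when a lateral offset falls in the gap between the two lanes."""
    return 1.0 < y < 2.5


hier_between = sum(1 for _, y in hier if between_lanes(y))
flat_between = sum(1 for y in flat if between_lanes(y))
print("hierarchical samples between lanes:", hier_between)
print("single-Gaussian samples between lanes:", flat_between)
```

In this sketch the single Gaussian places a large share of its samples in the gap between the lanes, while the two-level sampler places essentially none there. In the paper, the weights come from the proposed mode selection network and the per-mode distributions are conditioned on lane-level context vectors; the fixed scores and lane centers above merely stand in for those components.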


