Wayformer: Motion Forecasting via Simple Efficient Attention Networks

07/12/2022
by Nigamaa Nayakanti, et al.

Motion forecasting for autonomous driving is a challenging task because complex driving scenarios result in a heterogeneous mix of static and dynamic inputs. It is an open problem how best to represent and fuse information about road geometry, lane connectivity, time-varying traffic light state, and the history of a dynamic set of agents and their interactions into an effective encoding. To model this diverse set of input features, many approaches design an equally complex system with a diverse set of modality-specific modules. This results in systems that are difficult to scale, extend, or tune in rigorous ways to trade off quality and efficiency. In this paper, we present Wayformer, a family of attention-based architectures for motion forecasting that are simple and homogeneous. Wayformer offers a compact model description consisting of an attention-based scene encoder and a decoder. In the scene encoder we study the choice of early, late, and hierarchical fusion of the input modalities. For each fusion type we explore strategies to trade off efficiency and quality via factorized attention or latent query attention. We show that early fusion, despite its simplicity of construction, is not only modality agnostic but also achieves state-of-the-art results on both the Waymo Open Motion Dataset (WOMD) and Argoverse leaderboards, demonstrating the effectiveness of our design philosophy.
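
To make the terms "early fusion" and "latent query attention" concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the function names, feature widths, token counts, and random weights are all hypothetical, and the attention is single-head with no learned query/key/value projections, feed-forward layers, or stacking. Early fusion is shown as projecting every modality to a shared width and concatenating the tokens before a single attention step. Latent query attention is shown as cross-attention from a smaller set of latent vectors, so the output length is set by the number of latents rather than the number of input tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys_values):
    # Single-head scaled dot-product attention.
    # Assumption: keys and values are the same tensor and no learned
    # projections are applied, to keep the sketch short.
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ keys_values

def early_fusion_encode(modalities, projections, latents=None):
    # Early fusion: project each modality to a shared width, then
    # concatenate all tokens into one sequence before attending.
    tokens = np.concatenate(
        [x @ w for x, w in zip(modalities, projections)], axis=0
    )
    if latents is None:
        # Full self-attention over all N fused tokens (N x N scores).
        return attention(tokens, tokens)
    # Latent query attention: L latents attend to N tokens (L x N scores).
    return attention(latents, tokens)

# Hypothetical inputs: (num_tokens, num_features) per modality.
rng = np.random.default_rng(0)
d_model = 16
agent_history = rng.normal(size=(10, 7))
road_graph = rng.normal(size=(50, 5))
traffic_lights = rng.normal(size=(4, 3))
modalities = [agent_history, road_graph, traffic_lights]
projections = [rng.normal(size=(m.shape[1], d_model)) for m in modalities]
latents = rng.normal(size=(8, d_model))  # stand-in for learned latents

full = early_fusion_encode(modalities, projections)
compressed = early_fusion_encode(modalities, projections, latents)
print(full.shape, compressed.shape)
```

With 64 fused input tokens and 8 latents, the full variant scores 64 x 64 token pairs while the latent variant scores only 8 x 64, which is the kind of efficiency and quality trade-off the abstract refers to.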


research
08/12/2021

Decoder Fusion RNN: Context and Interaction Aware Decoders for Trajectory Prediction

Forecasting the future behavior of all traffic agents in the vicinity is...
research
06/15/2023

Motion Perceiver: Real-Time Occupancy Forecasting for Embedded Systems

This work introduces a flexible architecture for real-time occupancy for...
research
11/29/2021

MultiPath++: Efficient Information Fusion and Trajectory Aggregation for Behavior Prediction

Predicting the future behavior of road users is one of the most challeng...
research
03/08/2023

Dynamic Scenario Representation Learning for Motion Forecasting with Heterogeneous Graph Convolutional Recurrent Networks

Due to the complex and changing interactions in dynamic scenarios, motio...
research
08/30/2023

Adaptive Multi-Modalities Fusion in Sequential Recommendation Systems

In sequential recommendation, multi-modal information (e.g., text or ima...
research
12/07/2022

Towards Explainable Motion Prediction using Heterogeneous Graph Representations

Motion prediction systems aim to capture the future behavior of traffic ...
research
03/12/2020

MVLoc: Multimodal Variational Geometry-Aware Learning for Visual Localization

Recent learning-based research has achieved impressive results in the fi...
