End-to-end Contextual Perception and Prediction with Interaction Transformer

08/13/2020
by Lingyun Luke Li, et al.

In this paper, we tackle the problem of detecting objects in 3D and forecasting their future motion in the context of self-driving. Towards this goal, we design a novel approach that explicitly takes into account the interactions between actors. To capture their spatial-temporal dependencies, we propose a recurrent neural network with a novel Transformer architecture, which we call the Interaction Transformer. Importantly, our model can be trained end-to-end, and runs in real-time. We validate our approach on two challenging real-world datasets: ATG4D and nuScenes. We show that our approach can outperform the state-of-the-art on both datasets. In particular, we significantly improve the social compliance between the estimated future trajectories, resulting in far fewer collisions between the predicted actors.
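As a rough illustration of how attention can let each actor's representation depend on the other actors in the scene, the sketch below applies plain scaled dot-product self-attention to a handful of per-actor feature vectors. It is not the paper's Interaction Transformer: the function names, the toy feature values, the absence of learned query/key/value projections, and the simple residual update are all assumptions made for illustration, and the recurrent part of the model is not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def actor_self_attention(features):
    """Scaled dot-product self-attention across actors.

    features: list of N actor feature vectors (equal-length lists of floats).
    Hypothetical simplification: queries, keys and values are the raw
    features themselves (no learned projections).
    Returns (updated_features, attention_weights).
    """
    d = len(features[0])
    scale = 1.0 / math.sqrt(d)
    updated, weights = [], []
    for query in features:
        # Similarity of this actor to every actor (including itself).
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in features]
        w = softmax(scores)
        # Weighted sum of all actors' features: the interaction context.
        context = [sum(w[j] * features[j][c] for j in range(len(features)))
                   for c in range(d)]
        # Residual update: keep the actor's own feature, add the context.
        updated.append([q + c for q, c in zip(query, context)])
        weights.append(w)
    return updated, weights


# Toy scene with three actors and 2-dimensional features (made-up values).
actors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
updated, weights = actor_self_attention(actors)
```

Each row of `weights` sums to one, so every actor's updated feature is its own feature plus a convex combination of all actors' features. In this toy scene the first actor attends more strongly to the second actor than to the third, because their features are more similar.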

05/29/2020

PnPNet: End-to-End Perception and Prediction with Tracking in the Loop

We tackle the problem of joint perception and motion forecasting in the ...
04/21/2022

Learning Future Object Prediction with a Spatiotemporal Detection Transformer

We explore future object prediction – a challenging problem where all ob...
01/27/2021

Spatial-Channel Transformer Network for Trajectory Prediction on the Traffic Scenes

Predicting motion of surrounding agents is critical to real-world applic...
02/25/2019

Stochastic Prediction of Multi-Agent Interactions from Partial Observations

We present a method that learns to integrate temporal information, from ...
04/01/2021

TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking

Tracking multiple objects in videos relies on modeling the spatial-tempo...
09/20/2022

Traffic Accident Risk Forecasting using Contextual Vision Transformers

Recently, the problem of traffic accident risk forecasting has been gett...
12/22/2020

Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net

In this paper we propose a novel deep neural network that is able to joi...