Predicting the Next Action by Modeling the Abstract Goal

09/12/2022
by Debaditya Roy, et al.

Anticipating human actions is an inherently uncertain problem. However, this uncertainty can be reduced if we have a sense of the goal that the actor is trying to achieve. Here, we present an action anticipation model that leverages goal information to reduce the uncertainty in future predictions. Since neither goal information nor action labels for the observed segments are available during inference, we rely on visual representations to encapsulate information about both actions and goals. From these, we derive a novel concept called the abstract goal, which is conditioned on observed sequences of visual features for action anticipation. We model the abstract goal as a distribution whose parameters are estimated using a variational recurrent network. We sample multiple candidates for the next action and introduce a goal consistency measure to determine the candidate that best follows from the abstract goal. Our method obtains impressive results on the challenging EPIC-Kitchens-55 (EK55), EK100, and EGTEA Gaze+ datasets. On the seen kitchens (S1) split of EK55, we obtain absolute improvements of +13.69, +11.24, and +5.19 in Top-1 verb, Top-1 noun, and Top-1 action anticipation accuracy, respectively, over prior state-of-the-art methods. We likewise obtain significant improvements on the unseen kitchens (S2) split for Top-1 verb (+10.75), noun (+5.84), and action (+2.87) anticipation. A similar trend holds on the EGTEA Gaze+ dataset, with absolute improvements of +9.9, +13.1, and +6.8 for noun, verb, and action anticipation. As of this submission, our method is the new state-of-the-art for action anticipation on EK55 and EGTEA Gaze+ (leaderboard: https://competitions.codalab.org/competitions/20071#results). Code available at https://github.com/debadityaroy/Abstract_Goal
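The abstract is terse about the mechanics, so the following is a minimal sketch of the idea: a Gaussian abstract goal whose parameters come from a recurrent encoder over observed visual features, Monte Carlo sampling of next-action candidates, and a simple goal-consistency score used to pick the best candidate. The module names, dimensions, and the cosine-similarity consistency measure are illustrative assumptions, not the authors' exact implementation, which uses a variational recurrent network conditioned step by step on the observed features.

```python
# Illustrative sketch only: a Gaussian "abstract goal" inferred from observed
# visual features, candidate sampling, and a toy goal-consistency score.
# All names, dimensions, and the consistency measure are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AbstractGoalAnticipator(nn.Module):
    def __init__(self, feat_dim=1024, hidden_dim=256, goal_dim=128, num_actions=2513):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        # Parameters of the abstract-goal distribution (diagonal Gaussian).
        self.goal_mu = nn.Linear(hidden_dim, goal_dim)
        self.goal_logvar = nn.Linear(hidden_dim, goal_dim)
        # Next-action classifier conditioned on a sampled goal and the history.
        self.action_head = nn.Linear(goal_dim + hidden_dim, num_actions)

    def forward(self, feats, num_samples=5):
        """feats: (B, T, feat_dim) visual features of the observed segment."""
        _, h = self.encoder(feats)                 # h: (1, B, hidden_dim)
        h = h.squeeze(0)                           # (B, hidden_dim)
        mu, logvar = self.goal_mu(h), self.goal_logvar(h)
        std = torch.exp(0.5 * logvar)

        logits_per_sample, consistency = [], []
        for _ in range(num_samples):
            goal = mu + std * torch.randn_like(std)   # reparameterized sample
            logits = self.action_head(torch.cat([goal, h], dim=-1))
            logits_per_sample.append(logits)
            # Toy goal-consistency score: agreement of the sampled goal with
            # the distribution mean (a stand-in for the paper's measure).
            consistency.append(F.cosine_similarity(goal, mu, dim=-1))

        logits = torch.stack(logits_per_sample, dim=1)   # (B, S, num_actions)
        consistency = torch.stack(consistency, dim=1)    # (B, S)
        # Keep, per example, the candidate whose goal is most consistent.
        best = consistency.argmax(dim=1)                 # (B,)
        best_logits = logits[torch.arange(logits.size(0)), best]
        return best_logits, mu, logvar


if __name__ == "__main__":
    model = AbstractGoalAnticipator()
    feats = torch.randn(2, 8, 1024)      # 2 clips, 8 observed feature steps
    next_action_logits, mu, logvar = model(feats)
    print(next_action_logits.shape)      # torch.Size([2, 2513])
```

During training, such a model would typically combine a classification loss on the anticipated action with a KL term on the goal distribution; the sketch omits the loss and the recurrent, per-step conditioning of the goal for brevity.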


