Trespassing the Boundaries: Labeling Temporal Bounds for Object Interactions in Egocentric Video

03/27/2017
by Davide Moltisanti, et al.

Manual annotations of temporal bounds for object interactions (i.e. start and end times) are typical training input to recognition, localization and detection algorithms. For three publicly available egocentric datasets, we uncover inconsistencies in ground truth temporal bounds within and across annotators and datasets. We systematically assess the robustness of state-of-the-art approaches to changes in labeled temporal bounds, for object interaction recognition. As boundaries are trespassed, a drop of up to 10% is observed for both Improved Dense Trajectories and Two-Stream Convolutional Neural Network. We demonstrate that such disagreement stems from a limited understanding of the distinct phases of an action, and propose annotating based on the Rubicon Boundaries, inspired by a similarly named cognitive model, for consistent temporal bounds of object interactions. Evaluated on a public dataset, we report a 4% increase in overall accuracy, and an increase in accuracy for 55% of classes when Rubicon Boundaries are used for temporal annotations.
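To make the notion of inconsistent temporal bounds concrete, the following minimal sketch computes the temporal intersection-over-union (IoU) of two labeled segments for the same object interaction. It is an illustration under stated assumptions, not necessarily the measure used in the paper: the function name `temporal_iou` and the example start and end times are hypothetical.

```python
def temporal_iou(segment_a, segment_b):
    """Temporal IoU of two (start, end) segments, given in seconds."""
    a_start, a_end = segment_a
    b_start, b_end = segment_b
    # Overlap is zero when the segments do not intersect.
    intersection = max(0.0, min(a_end, b_end) - max(a_start, b_start))
    union = (a_end - a_start) + (b_end - b_start) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical labels from two annotators for the same interaction.
annotator_1 = (2.0, 6.0)
annotator_2 = (3.0, 7.0)

iou = temporal_iou(annotator_1, annotator_2)
print(f"temporal IoU = {iou:.2f}")
```

An IoU of 1 means the two annotators agree exactly on the bounds, while values closer to 0 indicate that one label extends well past the other.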


