How Much Does Audio Matter to Recognize Egocentric Object Interactions?

06/03/2019
by Alejandro Cartas, et al.

Sounds are an important source of information about our daily interactions with objects. For instance, a significant number of people can discern the temperature of water being poured just by listening. However, only a few works have explored the use of audio for the classification of object interactions, either in conjunction with vision or as a single modality. In this preliminary work, we propose an audio model for egocentric action recognition and explore its usefulness on each part of the problem (noun, verb, and action classification). Our model achieves a competitive result in verb classification (34.26% accuracy) on a standard benchmark with respect to vision-based state-of-the-art systems, while using a comparatively lighter architecture.

research
10/15/2019

Seeing and Hearing Egocentric Actions: How Much Can We Learn?

Our interaction with the world is an inherently multimodal experience. H...
research
09/20/2019

Making the Invisible Visible: Action Recognition Through Walls and Occlusions

Understanding people's actions and interactions typically depends on see...
research
10/05/2017

A self-organizing neural network architecture for learning human-object interactions

The visual recognition of transitive actions comprising human-object int...
research
08/07/2023

ViLP: Knowledge Exploration using Vision, Language, and Pose Embeddings for Video Action Recognition

Video Action Recognition (VAR) is a challenging task due to its inherent...
research
06/27/2021

Hear Me Out: Fusional Approaches for Audio Augmented Temporal Action Localization

State of the art architectures for untrimmed video Temporal Action Local...
research
07/31/2018

Understanding human-human interactions: a survey

Many videos depict people, and it is their interactions that inform us o...
research
04/10/2019

Next-Active-Object prediction from Egocentric Videos

Although First Person Vision systems can sense the environment from the ...
