Egocentric Activity Recognition with Multimodal Fisher Vector

01/25/2016
by Sibo Song, et al.

With the increasing availability of wearable devices, research on egocentric activity recognition has recently received much attention. In this paper, we build a Multimodal Egocentric Activity dataset that includes egocentric videos and sensor data for 20 fine-grained and diverse activity categories. We present a novel strategy to extract temporal trajectory-like features from sensor data, and we propose applying the Fisher Kernel framework to fuse video features with the temporally enhanced sensor features. Experimental results show that with careful design of the feature extraction and fusion algorithm, sensor data can enhance information-rich video data. We make the Multimodal Egocentric Activity dataset publicly available to facilitate future research.
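The Fisher Kernel fusion described above can be sketched as follows. This is not the authors' implementation: the feature dimensions, GMM sizes, and the simplification of using only gradients with respect to the Gaussian means are illustrative assumptions. The common multimodal strategy of encoding each modality with its own Fisher Vector and concatenating the results is shown here.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(descriptors, gmm):
    """Simplified Fisher Vector: gradients w.r.t. GMM means only,
    with the standard power- and L2-normalization."""
    q = gmm.predict_proba(descriptors)        # (T, K) soft assignments
    T = descriptors.shape[0]
    means = gmm.means_                        # (K, D)
    sigmas = np.sqrt(gmm.covariances_)        # (K, D), diagonal covariance
    weights = gmm.weights_                    # (K,)
    # Normalized gradient of the log-likelihood w.r.t. each mean
    diff = (descriptors[:, None, :] - means[None, :, :]) / sigmas[None, :, :]
    fv = (q[:, :, None] * diff).sum(axis=0) / (T * np.sqrt(weights)[:, None])
    fv = fv.ravel()
    fv = np.sign(fv) * np.sqrt(np.abs(fv))    # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)  # L2 normalization

# Illustrative random stand-ins for real descriptors
rng = np.random.default_rng(0)
video_feats = rng.normal(size=(500, 16))   # e.g., trajectory descriptors
sensor_feats = rng.normal(size=(200, 8))   # e.g., temporal sensor features

gmm_v = GaussianMixture(n_components=4, covariance_type="diag",
                        random_state=0).fit(video_feats)
gmm_s = GaussianMixture(n_components=4, covariance_type="diag",
                        random_state=0).fit(sensor_feats)

# Fuse modalities by concatenating per-modality Fisher Vectors,
# yielding a (4*16 + 4*8,) = (96,) descriptor for a classifier.
fused = np.concatenate([fisher_vector(video_feats, gmm_v),
                        fisher_vector(sensor_feats, gmm_s)])
```

The fused vector can then be fed to any standard classifier (e.g., a linear SVM), which is the typical use of Fisher Vector encodings.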

Related research:

- EmbraceNet for Activity: A Deep Multimodal Fusion Architecture for Activity Recognition (04/29/2020)
  Human activity recognition using multiple sensors is a challenging but p...
- Human Activity Recognition Based on Wearable Sensor Data: A Standardization of the State-of-the-Art (06/13/2018)
  Human activity recognition based on wearable sensor data has been an att...
- Fine-grained Activity Recognition in Baseball Videos (04/09/2018)
  In this paper, we introduce a challenging new dataset, MLB-YouTube, desi...
- Augmenting Deep Learning Adaptation for Wearable Sensor Data through Combined Temporal-Frequency Image Encoding (07/03/2023)
  Deep learning advancements have revolutionized scalable classification i...
- A Preliminary Study on Pattern Reconstruction for Optimal Storage of Wearable Sensor Data (02/25/2023)
  Efficient querying and retrieval of healthcare data is posing a critical...
- Sequential Weakly Labeled Multi-Activity Recognition and Location on Wearable Sensors using Recurrent Attention Network (04/13/2020)
  With the popularity and development of the wearable devices such as smar...
- MMTSA: Multimodal Temporal Segment Attention Network for Efficient Human Activity Recognition (10/14/2022)
  Multimodal sensors (e.g., visual, non-visual, and wearable) provide comp...
