SegCodeNet: Color-Coded Segmentation Masks for Activity Detection from Wearable Cameras

08/19/2020
by Asif Shahriyar Sushmit, et al.

Activity detection from first-person videos (FPV) captured with a wearable camera is an active research field with potential applications in many sectors, including healthcare, law enforcement, and rehabilitation. State-of-the-art methods use optical flow-based hybrid techniques that rely on features derived from the motion of objects across consecutive frames. In this work, we develop a two-stream network, SegCodeNet, that adds to the original RGB video stream a second network branch fed with video streams of color-coded semantic segmentation masks of relevant objects. We also include a stream-wise attention gate that prioritizes between the two streams and a frame-wise attention module that prioritizes the video frames containing relevant features. Experiments are conducted on an FPV dataset containing 18 activity classes in office environments. Compared with a single-stream network, the proposed two-stream method achieves absolute improvements of 14.366% and 10.324% in F1 score and accuracy, respectively, averaged over three frame sizes: 224×224, 112×112, and 64×64. The proposed method provides substantial performance gains for lower-resolution images, with absolute F1 score improvements of 17% and 26% for frame sizes of 112×112 and 64×64, respectively. The best performance is achieved at a frame size of 224×224, yielding an F1 score of 90.176% and an accuracy of 90.799%, which outperform the state-of-the-art Inflated 3D ConvNet (I3D) method by absolute margins of 4.529% and 2.419%, respectively.
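The abstract does not give the exact form of the two attention mechanisms, so the following is only a minimal sketch of one plausible reading: a softmax gate that mixes per-frame RGB and segmentation-mask features, followed by a softmax over frames that pools them into a clip descriptor. The function names, the linear scoring vectors (`w_rgb`, `w_seg`, `w_frame`), and the use of plain feature vectors in place of learned convolutional features are all assumptions made for illustration, not the authors' implementation.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fuse_streams(rgb_feat, seg_feat, w_rgb, w_seg):
    # Hypothetical stream-wise gate: score each stream with a linear
    # projection, softmax the two scores, and take a convex combination.
    g_rgb, g_seg = softmax([dot(w_rgb, rgb_feat), dot(w_seg, seg_feat)])
    return [g_rgb * r + g_seg * s for r, s in zip(rgb_feat, seg_feat)]

def attend_frames(frame_feats, w_frame):
    # Hypothetical frame-wise attention: softmax over per-frame scores,
    # then an attention-weighted sum over the time axis.
    alphas = softmax([dot(w_frame, f) for f in frame_feats])
    dim = len(frame_feats[0])
    pooled = [sum(a * f[i] for a, f in zip(alphas, frame_feats))
              for i in range(dim)]
    return pooled, alphas

def clip_descriptor(rgb_frames, seg_frames, w_rgb, w_seg, w_frame):
    # Fuse the two streams frame by frame, then pool across frames.
    fused = [fuse_streams(r, s, w_rgb, w_seg)
             for r, s in zip(rgb_frames, seg_frames)]
    return attend_frames(fused, w_frame)
```

In this sketch, all-zero scoring vectors make both softmaxes uniform, so the result reduces to a plain average over streams and frames. Non-zero scoring vectors shift weight toward one stream or toward particular frames, which is the prioritization behavior the abstract describes.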


