Video Action Recognition Collaborative Learning with Dynamics via PSO-ConvNet Transformer

02/17/2023
by Nguyen Huu Phong, et al.

Human Action Recognition (HAR) is the task of categorizing actions present in video sequences. Although it poses interesting problems, it remains one of the most challenging domains in pattern recognition. Convolutional Neural Networks (ConvNets) have demonstrated exceptional success in image recognition and related areas. However, these techniques are not always directly applicable to HAR, because temporal features must also be taken into account. In this paper, we present a dynamic PSO-ConvNet model for learning actions in video, drawing on our recent research in image recognition. Our method is based on a framework where the weight vector of each neural network serves as the position of a particle in phase space, and particles exchange their current weight vectors and gradient estimates of the loss function. We extend the approach to video by integrating a ConvNet with state-of-the-art temporal methods such as Transformer and Recurrent Neural Networks. The results reveal substantial advancements, with improvements of up to 9 [...] https://github.com/leonlha/Video-Action-Recognition-via-PSO-ConvNet-Transformer-Collaborative-Learning-with-Dynamics.
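The particle framing in the abstract can be illustrated with a toy sketch. The code below is not the paper's update rule: it assumes a simple quadratic loss in place of a network's training loss, and the names `collaborative_step`, `lr`, and `pull` are hypothetical. Each particle's position is its weight vector; at every step all particles share their positions and gradients, then each one moves along its own negative gradient and is pulled toward the particle with the lowest current loss.

```python
import random

def loss(w):
    # Toy quadratic loss with its minimum at w = (1, -2).
    # Assumption: this stands in for a neural network's training loss.
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2

def grad(w):
    # Analytic gradient of the toy loss above.
    return [2.0 * (w[0] - 1.0), 2.0 * (w[1] + 2.0)]

def collaborative_step(positions, lr=0.1, pull=0.2):
    """One synchronous update over all particles (hypothetical rule).

    Every particle exposes its weight vector and gradient; each particle then
    takes a gradient step and moves a fraction `pull` toward the best particle.
    """
    grads = [grad(w) for w in positions]
    best = min(positions, key=loss)  # particle with the lowest current loss
    return [
        [wi - lr * gi + pull * (bi - wi) for wi, gi, bi in zip(w, g, best)]
        for w, g in zip(positions, grads)
    ]

def run(num_particles=4, steps=50, seed=0):
    # Start the particles at random positions in the weight space.
    rng = random.Random(seed)
    positions = [[rng.uniform(-5, 5), rng.uniform(-5, 5)]
                 for _ in range(num_particles)]
    for _ in range(steps):
        positions = collaborative_step(positions)
    return positions

if __name__ == "__main__":
    for w in run():
        print([round(x, 4) for x in w], loss(w))
```

In this sketch the attraction term is what makes the training collaborative: a particle that starts far from the minimum benefits from the position of a better-placed one instead of relying on its own gradient alone.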


research
04/19/2019

Temporal Unet: Sample Level Human Action Recognition using WiFi

Human doing actions will result in WiFi distortion, which is widely expl...
research
05/20/2022

PSO-Convolutional Neural Networks with Heterogeneous Learning Rate

Convolutional Neural Networks (ConvNets or CNNs) have been candidly depl...
research
06/09/2021

Towards Training Stronger Video Vision Transformers for EPIC-KITCHENS-100 Action Recognition

With the recent surge in the research of vision transformers, they have ...
research
06/26/2021

An Image Classifier Can Suffice For Video Understanding

We propose a new perspective on video understanding by casting the video...
research
03/22/2017

Two-Stream RNN/CNN for Action Recognition in 3D Videos

The recognition of actions from video sequences has many applications in...
research
02/01/2021

Video Transformer Network

This paper presents VTN, a transformer-based framework for video recogni...
research
04/30/2019

Memory-Augmented Temporal Dynamic Learning for Action Recognition

Human actions captured in video sequences contain two crucial factors fo...
