
Cross-Enhancement Transform Two-Stream 3D ConvNets for Pedestrian Action Recognition of Autonomous Vehicles

by Dong Cao, et al.

Action recognition is an important research topic in machine vision. It is widely applied and is a key technology for pedestrian behavior recognition and intention prediction in autonomous driving. Building on the widely used 3D ConvNets algorithm, and combining it with the Two-Stream Inflated algorithm and transfer learning, we construct a Cross-Enhancement Transform based Two-Stream 3D ConvNets algorithm. The performance of the algorithm varies across datasets with different data distribution characteristics; in particular, the RGB stream and the optical flow stream of the two-stream model perform differently from each other. To address this, we take the data distribution characteristics of the specific dataset into account: the better-performing of the two streams serves as a teacher model that assists in training the other stream, and inference is then performed with both streams. We conducted experiments on the UCF-101, HMDB-51, and Kinetics datasets, and the results confirm the effectiveness of our algorithm.
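
The teacher-assisted training idea in the abstract can be illustrated with a small sketch. The abstract does not specify the Cross-Enhancement Transform itself, so the code below swaps in generic knowledge distillation (a softened KL term plus cross-entropy) and plain probability averaging for two-stream inference. All function names, the `alpha` and `temperature` parameters, and the use of validation accuracy to pick the teacher are assumptions made for illustration, not the authors' method.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_teacher(rgb_val_acc, flow_val_acc):
    """Assumption: the stream with higher validation accuracy acts as teacher."""
    return "rgb" if rgb_val_acc >= flow_val_acc else "flow"

def assisted_loss(student_logits, teacher_logits, label,
                  alpha=0.5, temperature=2.0):
    """Generic distillation loss (hypothetical stand-in for the paper's transform).

    Mixes cross-entropy on the true label with KL(teacher || student)
    computed on temperature-softened distributions.
    """
    ce = -math.log(softmax(student_logits)[label])
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    return (1 - alpha) * ce + alpha * kl

def two_stream_predict(rgb_logits, flow_logits):
    """Late fusion: average per-stream class probabilities, return the argmax."""
    p_rgb = softmax(rgb_logits)
    p_flow = softmax(flow_logits)
    fused = [(a + b) / 2 for a, b in zip(p_rgb, p_flow)]
    return max(range(len(fused)), key=fused.__getitem__)
```

In this sketch the weaker stream would be trained by minimizing `assisted_loss` against the frozen teacher's logits, after which `two_stream_predict` combines both streams at inference time.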




Bypass Enhancement RGB Stream Model for Pedestrian Action Recognition of Autonomous Vehicles

Pedestrian action recognition and intention prediction are one of the co...

Investigation on Combining 3D Convolution of Image Data and Optical Flow to Generate Temporal Action Proposals

In this paper, a novel two-stream architecture for the task of temporal ...

Recognition and 3D Localization of Pedestrian Actions from Monocular Video

Understanding and predicting pedestrian behavior is an important and cha...

Temporal Hockey Action Recognition via Pose and Optical Flows

Recognizing actions in ice hockey using computer vision poses challenges...

Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting

Instance-level contrastive learning techniques, which rely on data augme...

DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency Detection

Light field data exhibit favorable characteristics conducive to saliency...