Correlation Net: Spatiotemporal Multimodal Deep Learning

07/22/2018
by Novanto Yudistira, et al.

This paper describes a network that captures spatiotemporal correlations over arbitrary periods of time. The proposed scheme operates as a complementary, extended network over spatiotemporal regions. Multimodal fusion has recently been researched extensively in deep learning. For action recognition, the spatial and temporal streams are vital components of deep convolutional neural networks (CNNs), but reducing overfitting and fusing these two streams remain open problems. The existing fusion approach is to average the two streams. To address these problems, we propose a correlation network with a Shannon regularizer that learns from a CNN that has already been trained. Long-range video may contain spatiotemporal correlations over arbitrary periods of time. These correlations can be captured using simple fully connected layers, which form the correlation network. The approach is found to be complementary to existing network fusion methods. We evaluate it on the UCF-101 and HMDB-51 datasets, and the resulting improvement in accuracy demonstrates the importance of multimodal correlation.
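
To make the contrast between averaging and a learned correlation head concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the architecture, so the layer sizes, the use of per-stream softmax scores as inputs, the random weights, and every function name (`average_fusion`, `correlation_head`, `shannon_entropy`) are assumptions made for illustration only. How the Shannon term is weighted in the training loss is also not stated in the abstract, so the sketch only computes the entropy.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def average_fusion(p_spatial, p_temporal):
    # Baseline fusion named in the abstract: average the two streams.
    return 0.5 * (p_spatial + p_temporal)

def shannon_entropy(p, eps=1e-12):
    # Shannon entropy (in nats) of each row of class probabilities.
    return -np.sum(p * np.log(p + eps), axis=-1)

def correlation_head(f_spatial, f_temporal, W1, b1, W2, b2):
    # Hypothetical head: concatenate both streams, then apply two
    # fully connected layers (ReLU in between) and a softmax.
    x = np.concatenate([f_spatial, f_temporal], axis=-1)
    h = np.maximum(0.0, x @ W1 + b1)
    return softmax(h @ W2 + b2)

rng = np.random.default_rng(0)
n_clips, n_classes, hidden = 4, 101, 64  # 101 classes as in UCF-101; other sizes assumed

# Stand-ins for the outputs of already-trained spatial and temporal CNNs.
p_s = softmax(rng.normal(size=(n_clips, n_classes)))
p_t = softmax(rng.normal(size=(n_clips, n_classes)))

p_avg = average_fusion(p_s, p_t)

# Randomly initialised weights; in practice these would be learned.
W1 = rng.normal(scale=0.01, size=(2 * n_classes, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.01, size=(hidden, n_classes))
b2 = np.zeros(n_classes)

p_corr = correlation_head(p_s, p_t, W1, b1, W2, b2)
entropy = shannon_entropy(p_corr)  # one value per clip
```

Averaging has no trainable parameters, whereas the head above can in principle learn interactions between the two streams; the entropy value is the quantity a Shannon-style regularizer would act on.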
