Transfer Learning for Video Recognition with Scarce Training Data for Deep Convolutional Neural Network

09/15/2014
by   Yu-Chuan Su, et al.

Unconstrained video recognition and Deep Convolutional Networks (DCNs) have recently been two active topics in computer vision. In this work, we apply DCNs as frame-based recognizers for video recognition. Our preliminary studies, however, show that video corpora with complete ground truth are usually not large and diverse enough to learn a robust model. Networks trained directly on the video data set suffer from significant overfitting and have a poor recognition rate on the test set. The same lack of training samples limits the use of deep models on a wide range of computer vision problems where obtaining training data is difficult. To overcome the problem, we perform transfer learning from images to videos, using the knowledge in a weakly labeled image corpus for video recognition. The image corpus helps the network learn important visual patterns of natural images, patterns that are ignored by models trained only on the video corpus. The resulting networks therefore generalize better and achieve a higher recognition rate. We show that, by means of transfer learning from images to videos, we can learn a frame-based recognizer with only 4k videos. Because the image corpus is weakly labeled, the entire learning process requires only 4k annotated instances, far fewer than the million-scale image data sets required by previous works. The same approach may be applied to other visual recognition tasks where only scarce training data is available, which improves the applicability of DCNs to various computer vision problems. Our experiments also reveal the correlation between meta-parameters and the performance of DCNs, given the properties of the target problem and data. These results lead to a heuristic for meta-parameter selection in future research that does not rely on a time-consuming meta-parameter search.
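The general idea of pretraining on a large image corpus, fine-tuning on a small set of video frames, and recognizing a video frame by frame can be illustrated with a toy sketch. Everything below is an assumption made for illustration: a linear softmax classifier stands in for the DCN, the features are synthetic, the helper names (`train`, `predict_video`, `make_data`) are hypothetical, and averaging per-frame scores is one common aggregation choice that the abstract does not specify. It is not the authors' implementation.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with the usual max subtraction for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(X, y, W, lr=0.1, epochs=200):
    # Plain gradient descent on softmax regression, starting from the given W.
    # Passing pretrained weights as W is what makes the second call "fine-tuning".
    Y = np.eye(W.shape[1])[y]
    for _ in range(epochs):
        P = softmax(X @ W)
        W = W - lr * X.T @ (P - Y) / len(X)
    return W

def predict_video(frames, W):
    # Frame-based recognition: score every frame, then average over the video.
    # (Averaging is an assumption; the abstract does not name the aggregation.)
    return int(softmax(frames @ W).mean(axis=0).argmax())

def make_data(rng, n, shift):
    # Hypothetical synthetic features: two classes separated along axis 0.
    # `shift` mimics a small domain gap between images and video frames.
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 5))
    X[:, 0] += np.where(y == 1, 2.0, -2.0) + shift
    return X, y

rng = np.random.default_rng(0)
X_img, y_img = make_data(rng, 1000, shift=0.0)  # large image corpus
X_vid, y_vid = make_data(rng, 40, shift=0.3)    # scarce video-frame corpus

W_img = train(X_img, y_img, np.zeros((5, 2)))        # step 1: pretrain on images
W_vid = train(X_vid, y_vid, W_img.copy(), epochs=20)  # step 2: fine-tune on frames

video = rng.normal(size=(16, 5))  # one unseen "video" of 16 frames
video[:, 0] += 2.3                # drawn from class 1 of the video domain
print(predict_video(video, W_vid))
```

The point of the sketch is only the control flow: the second `train` call starts from `W_img` instead of from zeros and runs for fewer epochs, so the small video corpus adjusts a model that already encodes patterns learned from the larger image corpus.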


