Task-adaptive Spatial-Temporal Video Sampler for Few-shot Action Recognition

07/20/2022
by Huabin Liu, et al.

A primary challenge in few-shot action recognition is inadequate video data for training. To address this issue, current methods in this field mainly focus on devising algorithms at the feature level, while little attention is paid to processing the input video data. Moreover, existing frame sampling strategies may omit critical action information in the temporal and spatial dimensions, which further reduces video utilization efficiency. In this paper, we propose a novel video frame sampler for few-shot action recognition, in which task-specific spatial-temporal frame sampling is achieved via a temporal selector (TS) and a spatial amplifier (SA). Specifically, our sampler first scans the whole video at a small computational cost to obtain a global perception of the video frames. The TS then selects the top-T frames that contribute most significantly. Subsequently, the SA emphasizes the discriminative information of each frame by amplifying critical regions under the guidance of saliency maps. We further adopt task-adaptive learning to dynamically adjust the sampling strategy according to the episode task at hand. The implementations of both TS and SA are differentiable for end-to-end optimization, facilitating seamless integration of our proposed sampler with most few-shot action recognition methods. Extensive experiments show a significant boost in performance on various benchmarks, including long-term videos.
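
To make the top-T selection step concrete, the following is a minimal sketch and not the authors' implementation. It assumes each frame has already been given a scalar importance score (the `scores` argument and the function name `select_top_t_frames` are hypothetical), and it uses a hard, non-differentiable selection, whereas the abstract states that the actual TS is differentiable.

```python
from typing import List, Sequence


def select_top_t_frames(scores: Sequence[float], t: int) -> List[int]:
    """Return the indices of the t highest-scoring frames, in temporal order.

    `scores` is a hypothetical per-frame importance score; how the paper's
    temporal selector computes such scores is not described in the abstract.
    """
    if not 0 < t <= len(scores):
        raise ValueError("t must be between 1 and the number of frames")
    # Rank frame indices by descending score; the sort is stable, so
    # equal scores keep the earlier frame first.
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    # Re-sort the selected indices so the sampled clip keeps frame order.
    return sorted(ranked[:t])


if __name__ == "__main__":
    frame_scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7]
    print(select_top_t_frames(frame_scores, 3))  # prints [1, 3, 5]
```

Returning the indices in temporal order keeps the sampled frames usable as a short clip. An end-to-end trainable selector would need a differentiable relaxation in place of the hard sort used here.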


research
05/10/2023

Few-shot Action Recognition via Intra- and Inter-Video Information Maximization

Current few-shot action recognition involves two primary sources of info...
research
09/30/2022

Alignment-guided Temporal Attention for Video Action Recognition

Temporal modeling is crucial for various video learning tasks. Most rece...
research
07/27/2023

Sample Less, Learn More: Efficient Action Recognition via Frame Feature Restoration

Training an effective video action recognition model poses significant c...
research
07/13/2023

Free-Form Composition Networks for Egocentric Action Recognition

Egocentric action recognition is gaining significant attention in the fi...
research
08/15/2018

Temporal Sequence Distillation: Towards Few-Frame Action Recognition in Videos

Video Analytics Software as a Service (VA SaaS) has been rapidly growing...
research
08/14/2023

On the Importance of Spatial Relations for Few-shot Action Recognition

Deep learning has achieved great success in video recognition, yet still...
research
04/20/2021

MGSampler: An Explainable Sampling Strategy for Video Action Recognition

Frame sampling is a fundamental problem in video action recognition due ...
