Video Test-Time Adaptation for Action Recognition

11/24/2022
by Wei Lin, et al.

Although action recognition systems can achieve top performance when evaluated on in-distribution test points, they are vulnerable to unanticipated distribution shifts in test data. However, test-time adaptation of video action recognition models against common distribution shifts has so far not been demonstrated. We propose to address this problem with an approach tailored to spatio-temporal models that is capable of adapting on a single video sample per step. It consists of a feature distribution alignment technique that aligns online estimates of test set statistics towards the training statistics. We further enforce prediction consistency over temporally augmented views of the same test video sample. Evaluations on three benchmark action recognition datasets show that our proposed technique is architecture-agnostic and able to significantly boost the performance of both the state-of-the-art convolutional architecture TANet and the Video Swin Transformer. Our proposed method demonstrates a substantial performance gain over existing test-time adaptation approaches, both in evaluations of a single distribution shift and in the challenging case of random distribution shifts. Code will be available at <https://github.com/wlin-at/ViTTA>.
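
The two ingredients named in the abstract, aligning online estimates of test statistics with training statistics and enforcing consistency across temporally augmented views, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the exponential moving average, the L1 distances, the `momentum` value, and every name below (`OnlineStats`, `alignment_loss`, `consistency_loss`) are assumptions chosen for illustration only.

```python
import numpy as np


class OnlineStats:
    """Running per-channel mean/variance of test features.

    Assumption: an exponential moving average is used as the online
    estimator; the abstract does not specify the estimator.
    """

    def __init__(self, momentum=0.1):
        self.momentum = momentum
        self.mean = None
        self.var = None

    def update(self, feats):
        # feats: array of shape (positions, channels) from ONE test video
        m, v = feats.mean(axis=0), feats.var(axis=0)
        if self.mean is None:
            # First video: initialise the estimates directly
            self.mean, self.var = m, v
        else:
            a = self.momentum
            self.mean = (1 - a) * self.mean + a * m
            self.var = (1 - a) * self.var + a * v
        return self.mean, self.var


def alignment_loss(test_mean, test_var, train_mean, train_var):
    # Assumed L1 distance between online test statistics and
    # precomputed training statistics
    return float(np.abs(test_mean - train_mean).sum()
                 + np.abs(test_var - train_var).sum())


def consistency_loss(probs_a, probs_b):
    # Assumed L1 distance between class probabilities predicted for two
    # temporally augmented views of the same test video
    return float(np.abs(probs_a - probs_b).sum())
```

In an actual adaptation loop, the sum of such losses would be minimised with respect to the model parameters after each incoming test video; the sketch only computes the loss values.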


