Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions

10/13/2021
by Chenyu Yi, et al.

State-of-the-art deep neural networks are vulnerable to common corruptions (e.g., input data degradations, distortions, and disturbances caused by weather changes, system errors, and processing). While much progress has been made in analyzing and improving the robustness of models for image understanding, robustness in video understanding remains largely unexplored. In this paper, we establish a corruption robustness benchmark, consisting of Mini Kinetics-C and Mini SSV2-C, that considers temporal corruptions in addition to the spatial corruptions found in images. We make the first attempt to conduct an exhaustive study of the corruption robustness of established CNN-based and Transformer-based spatial-temporal models. The study provides some guidance on robust model design and training: Transformer-based models perform better than CNN-based models on corruption robustness; the generalization ability of spatial-temporal models implies robustness against temporal corruptions; and model corruption robustness (especially robustness in the temporal domain) improves with computational cost and model capacity, which may contradict the current trend of improving the computational efficiency of models. Moreover, we find that robustness interventions designed for image-related tasks (e.g., training models with noise) may not work for spatial-temporal models.
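To make the distinction between spatial and temporal corruptions concrete, here is a minimal sketch in plain Python. It is illustrative only and is not taken from the benchmark: the function names `add_gaussian_noise` and `freeze_frames`, the toy video representation (a list of frames, each a flat list of 8-bit pixel values), and the parameters `sigma` and `prob` are all assumptions. The actual corruption types and severity levels in Mini Kinetics-C and Mini SSV2-C are defined in the full paper.

```python
import random

def add_gaussian_noise(video, sigma, seed=0):
    """Spatial corruption (hypothetical): perturb each pixel of each frame independently."""
    rng = random.Random(seed)
    return [
        [min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in frame]
        for frame in video
    ]

def freeze_frames(video, prob, seed=0):
    """Temporal corruption (hypothetical): with probability `prob`, a frame is
    replaced by a copy of the previous output frame. Pixel values inside any
    single frame stay clean; only the motion across frames is disturbed."""
    rng = random.Random(seed)
    out = [list(video[0])]
    for frame in video[1:]:
        if rng.random() < prob:
            out.append(list(out[-1]))   # repeat the last shown frame
        else:
            out.append(list(frame))     # pass the frame through unchanged
    return out

# Toy video: 8 frames of 4 pixels each, brightness increasing over time.
video = [[t * 10 + i for i in range(4)] for t in range(8)]
noisy = add_gaussian_noise(video, sigma=5.0)
frozen = freeze_frames(video, prob=0.5)
```

In this toy setting, every frame of `frozen` is still a valid clean frame taken from the original video, so a model that looks at frames one at a time would see nothing unusual. Only a model that reasons over time is affected, which is why temporal corruptions need to be evaluated separately from spatial ones.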
