SVIP: Sequence VerIfication for Procedures in Videos

12/13/2021
by   Yicheng Qian, et al.

In this paper, we propose a novel sequence verification task that aims to distinguish positive video pairs, which perform the same action sequence, from negative pairs that carry out the same task but differ by step-level transformations. The task is challenging because it sits in an open-set setting and does not rely on prior action detection or segmentation, which would require event-level or even frame-level annotations. To that end, we carefully reorganize two publicly available action-related datasets into a step-procedure-task structure. To fully investigate the effectiveness of any method, we also collect a scripted video dataset that enumerates all kinds of step-level transformations in chemical experiments. In addition, we introduce a novel evaluation metric, Weighted Distance Ratio, to ensure equivalence across different step-level transformations during evaluation. Finally, we introduce a simple but effective transformer-based baseline with a novel sequence alignment loss that better characterizes long-term dependencies between steps and outperforms other action recognition methods. Code and data will be released.


