Probabilistic Semantic Retrieval for Surveillance Videos with Activity Graphs

12/17/2017
by Yuting Chen, et al.

We present a novel framework for finding complex activities that match user-described queries in cluttered surveillance videos. The wide diversity of queries, coupled with the unavailability of annotated activity data, limits our ability to train activity models. To bridge the semantic gap, we let users describe an activity as a semantic graph, with object attributes associated with nodes and inter-object relationships associated with edges. We learn node- and edge-level visual predictors during training and, at test time, retrieve activities by identifying likely locations that match the semantic graph. We formulate a novel CRF-based probabilistic activity localization objective that accounts for misdetections, misclassifications, and track losses, and that outputs a likelihood score for a candidate grounded location of the query in the video. We seek groundings that maximize overall precision and recall. To handle the combinatorial search over all high-probability groundings, we propose a highest-precision subtree algorithm. Our method outperforms existing retrieval methods on benchmark datasets.
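To make the idea of a query as a semantic graph concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a query is stored as nodes carrying required object attributes and edges carrying a required relationship, and it scores one candidate grounding (an assignment of query nodes to video tracks) as a plain product of predictor probabilities. That product is a simplified stand-in for the CRF-based objective described above: it does not model misdetections, misclassifications, or track losses. The names `QueryGraph`, `grounding_score`, `node_prob`, `edge_prob`, and the toy score tables are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class QueryGraph:
    """Hypothetical container for a user-described activity query."""
    # Query node name -> set of required object attributes.
    nodes: dict[str, set[str]]
    # (node_a, node_b) -> required inter-object relationship.
    edges: dict[tuple[str, str], str]


def grounding_score(
    query: QueryGraph,
    grounding: dict[str, str],
    node_prob: Callable[[str, str], float],
    edge_prob: Callable[[str, str, str], float],
) -> float:
    """Multiply node- and edge-level predictor outputs for one grounding.

    Assumption: independent factors combined by a product. This is an
    illustrative simplification, not the paper's CRF objective.
    """
    score = 1.0
    for name, attrs in query.nodes.items():
        track = grounding[name]
        for attr in attrs:
            score *= node_prob(track, attr)
    for (a, b), rel in query.edges.items():
        score *= edge_prob(grounding[a], grounding[b], rel)
    return score


# Toy predictor outputs (made-up numbers, for illustration only).
NODE_SCORES = {("track_1", "person"): 0.9, ("track_2", "car"): 0.8}
EDGE_SCORES = {("track_1", "track_2", "near"): 0.5}


def node_prob(track: str, attr: str) -> float:
    return NODE_SCORES.get((track, attr), 0.0)


def edge_prob(track_a: str, track_b: str, rel: str) -> float:
    return EDGE_SCORES.get((track_a, track_b, rel), 0.0)


# Query: "a person near a car".
query = QueryGraph(
    nodes={"p": {"person"}, "c": {"car"}},
    edges={("p", "c"): "near"},
)
grounding = {"p": "track_1", "c": "track_2"}
score = grounding_score(query, grounding, node_prob, edge_prob)
print(score)
```

In this sketch, retrieval would amount to evaluating `grounding_score` over candidate groundings and keeping the high-scoring ones. Enumerating every candidate is the combinatorial search that the abstract's subtree algorithm is meant to avoid.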

research
11/21/2018

MAC: Mining Activity Concepts for Language-based Temporal Localization

We address the problem of language-based temporal localization in untrim...
research
03/08/2022

PAMI-AD: An Activity Detector Exploiting Part-attention and Motion Information in Surveillance Videos

Activity detection in surveillance videos is a challenging task caused b...
research
06/18/2020

Video Moment Localization using Object Evidence and Reverse Captioning

We address the problem of language-based temporal localization of moment...
research
05/05/2017

TALL: Temporal Activity Localization via Language Query

This paper focuses on temporal localization of actions in untrimmed vide...
research
03/02/2023

Jointly Visual- and Semantic-Aware Graph Memory Networks for Temporal Sentence Localization in Videos

Temporal sentence localization in videos (TSLV) aims to retrieve the mos...
research
05/01/2014

Retrieval in Long Surveillance Videos using User Described Motion and Object Attributes

We present a content-based retrieval method for long surveillance videos...
research
12/03/2016

Commonly Uncommon: Semantic Sparsity in Situation Recognition

Semantic sparsity is a common challenge in structured visual classificat...
