Discovery of Shared Semantic Spaces for Multi-Scene Video Query and Summarization

07/27/2015
by Xun Xu, et al.

The rapid growth of public-space CCTV installations has created a need for automated methods that exploit video surveillance data, including scene understanding, query, behaviour annotation and summarization. Extensive research has therefore been carried out on surveillance scene understanding and analysis. However, most studies consider single scenes or groups of adjacent scenes. The semantic similarity between different but related scenes (e.g., many traffic scenes of similar layout) is generally not exploited to improve automated surveillance tasks or to reduce manual effort. Exploiting commonality, and sharing supervised annotations, between different scenes is challenging for several reasons. Some scenes are entirely unrelated, so any information sharing between them would be detrimental. Others share only a subset of common activities, so information sharing is useful only if it is selective. Moreover, semantically similar activities, which should be modelled together and shared across scenes, may have quite different pixel-level appearance in each scene. To address these issues, we develop a new framework for distributed multiple-scene global understanding that clusters surveillance scenes by their ability to explain each other's behaviours, and further discovers which activities are shared and which are scene-specific within each cluster. We show how to use this structured representation of multiple scenes to improve common surveillance tasks, including scene activity understanding, cross-scene query-by-example, behaviour classification with reduced supervised labelling requirements, and video summarization. In each case, we demonstrate that our multi-scene model improves on both a collection of standard single-scene models and a flat model of all scenes.
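
The idea of clustering scenes by how well they explain each other's behaviours, and then separating shared from scene-specific activities, can be illustrated with a minimal sketch. This is not the authors' model. It assumes each scene has already been summarised as a small matrix of activity vectors in a common feature space (a strong assumption, since the abstract notes that pixel-level appearance differs between scenes), it swaps in cosine matching and thresholded connected components for whatever the paper actually uses, and every function name, threshold and toy value below is hypothetical.

```python
import numpy as np


def _unit_rows(m):
    # Normalise each activity vector to unit length (assumes no all-zero rows).
    return m / np.linalg.norm(m, axis=1, keepdims=True)


def explain_score(src, dst):
    """How well src's activities explain dst's: the mean, over dst
    activities, of the best cosine match among src activities."""
    sims = _unit_rows(dst) @ _unit_rows(src).T
    return float(sims.max(axis=1).mean())


def scene_affinity(scenes):
    """Symmetric scene-by-scene affinity built from mutual explain scores."""
    n = len(scenes)
    aff = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            s = 0.5 * (explain_score(scenes[i], scenes[j])
                       + explain_score(scenes[j], scenes[i]))
            aff[i, j] = aff[j, i] = s
    return aff


def cluster_scenes(aff, threshold):
    """Label scenes by connected components of the thresholded affinity graph."""
    n = aff.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and aff[i, j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def shared_activities(scene, other, threshold=0.8):
    """For each activity in scene, True if some activity in other matches it."""
    sims = _unit_rows(scene) @ _unit_rows(other).T
    return (sims.max(axis=1) >= threshold).tolist()


# Toy data (made up): rows are activities, columns are shared feature bins.
scene_a = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
scene_b = np.array([[0.9, 0.1, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]])
scene_c = np.array([[0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.1, 1.0]])
scenes = [scene_a, scene_b, scene_c]

labels = cluster_scenes(scene_affinity(scenes), threshold=0.4)
shared_in_a = shared_activities(scene_a, scene_b)
print(labels, shared_in_a)
```

In this toy setup, scenes A and B end up in one cluster because they share one activity, scene C is left in its own cluster, and only the first activity of scene A is flagged as shared with scene B. Selective sharing of this kind is what lets annotations transfer for the common activity while the scene-specific ones stay separate.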
