Predicting Visual Context for Unsupervised Event Segmentation in Continuous Photo-streams

08/07/2018
by Ana García del Molino, et al.

Segmenting video content into events provides semantic structures for indexing, retrieval, and summarization. Since motion cues are not available in continuous photo-streams, and annotations in lifelogging are scarce and costly, the frames are usually clustered into events by comparing their visual features in an unsupervised way. However, such methodologies are ineffective at handling heterogeneous events, e.g. taking a walk, and temporary changes in the sight direction, e.g. at a meeting. To address these limitations, we propose Contextual Event Segmentation (CES), a novel segmentation paradigm that uses an LSTM-based generative network to model the photo-stream sequences, predict their visual context, and track their evolution. CES decides whether a frame is an event boundary by comparing the visual context generated from the frames in the past to the visual context predicted from the future. We implemented CES on a new and massive lifelogging dataset consisting of more than 1.5 million images spanning over 1,723 days. Experiments on the popular EDUB-Seg dataset show that our model outperforms the state-of-the-art by over 16 [...] only 3 points below that of human annotators.
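The boundary decision described in the abstract, comparing a context built from past frames with a context built from future frames, can be illustrated with a minimal sketch. This is not the authors' implementation: the LSTM-based generative network is replaced here by a plain mean over a sliding window, and the function names, the `window` size, the cosine distance, and the `threshold` value are all assumptions chosen for illustration only.

```python
import math


def mean_vector(frames):
    """Average a non-empty list of equal-length feature vectors."""
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def boundary_scores(features, window=3):
    """Score each frame by how much past and future contexts disagree.

    Assumption: the 'visual context' is a window mean, standing in for
    the context that a learned sequence model would predict.
    """
    scores = []
    for t in range(len(features)):
        past = features[max(0, t - window):t]
        future = features[t:t + window]
        if not past or not future:
            scores.append(0.0)  # no context on one side, so no evidence
            continue
        scores.append(cosine_distance(mean_vector(past), mean_vector(future)))
    return scores


def detect_boundaries(features, window=3, threshold=0.5):
    """Return frame indices whose score is a local peak above threshold."""
    scores = boundary_scores(features, window)
    boundaries = []
    for t in range(1, len(scores) - 1):
        is_peak = scores[t] >= scores[t - 1] and scores[t] > scores[t + 1]
        if is_peak and scores[t] >= threshold:
            boundaries.append(t)
    return boundaries


# Toy photo-stream: five frames of one scene, then five of another.
frames = [[1.0, 0.0]] * 5 + [[0.0, 1.0]] * 5
print(detect_boundaries(frames))  # [5]
```

In this toy stream the score peaks at index 5, the first frame of the second scene. A window mean reacts to any brief change in view, which is exactly the weakness the abstract attributes to feature-comparison methods; the learned context predictor in CES is meant to be more robust to such temporary changes.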


