Video SemNet: Memory-Augmented Video Semantic Network

11/22/2020
by   Prashanth Vijayaraghavan, et al.
0

Stories are a very compelling medium to convey ideas, experiences, social and cultural values. Narrative is a specific manifestation of the story that turns it into knowledge for the audience. In this paper, we propose a machine learning approach to capture the narrative elements in movies by bridging the gap between the low-level data representations and semantic aspects of the visual medium. We present a Memory-Augmented Video Semantic Network, called Video SemNet, to encode the semantic descriptors and learn an embedding for the video. The model employs two main components: (i) a neural semantic learner that learns latent embeddings of semantic descriptors and (ii) a memory module that retains and memorizes specific semantic patterns from the video. We evaluate the video representations obtained from variants of our model on two tasks: (a) genre prediction and (b) IMDB Rating prediction. We demonstrate that our model is able to predict genres and IMDB ratings with a weighted F-1 score of 0.72 and 0.63 respectively. The results are indicative of the representational power of our model and the ability of such representations to measure audience engagement.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/22/2016

Knowledge Transfer for Scene-specific Motion Prediction

When given a single frame of the video, humans can not only interpret th...
research
05/28/2014

Detection Bank: An Object Detection Based Video Representation for Multimedia Event Recognition

While low-level image features have proven to be effective representatio...
research
03/19/2018

Featureless: Bypassing feature extraction in action categorization

This method introduces an efficient manner of learning action categories...
research
12/08/2021

Semantic TrueLearn: Using Semantic Knowledge Graphs in Recommendation Systems

In informational recommenders, many challenges arise from the need to ha...
research
10/06/2021

Semantic Prediction: Which One Should Come First, Recognition or Prediction?

The ultimate goal of video prediction is not forecasting future pixel-va...
research
07/26/2016

Approximate Policy Iteration for Budgeted Semantic Video Segmentation

This paper formulates and presents a solution to the new problem of budg...
research
09/05/2020

Multimodal Memorability: Modeling Effects of Semantics and Decay on Video Memorability

A key capability of an intelligent system is deciding when events from p...

Please sign up or login with your details

Forgot password? Click here to reset