Zero-Shot Event Detection by Multimodal Distributional Semantic Embedding of Videos

12/02/2015
by Mohamed Elhoseiny, et al.

We propose a new zero-shot event detection method based on multimodal distributional semantic embedding of videos. Our model embeds object and action concepts, as well as other available modalities, from videos into a distributional semantic space. To our knowledge, this is the first zero-shot event detection model built on top of distributional semantics, and it extends them in three directions: (a) semantic embedding of multimodal information in videos (with a focus on the visual modalities), (b) automatically determining the relevance of concepts/attributes to a free-text query, which could be useful for other applications, and (c) retrieving videos by a free-text event query (e.g., "changing a vehicle tire") based on their content. We embed videos into a distributional semantic space and then measure the similarity between each video and the event query in free-text form. We validated our method on the large TRECVID MED (Multimedia Event Detection) challenge. Using only the event title as a query, our method outperformed the state-of-the-art, which relies on long event descriptions, improving the detection metric from 12.6. It is also an order of magnitude faster.
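The core idea of the abstract, embedding a video via its detected concepts into a distributional semantic (word-vector) space and scoring it against a free-text query by similarity, can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the toy random embedding table, the concept names, and the confidence-weighted pooling scheme are all assumptions standing in for a real word-embedding model (e.g., word2vec) and real concept detectors.

```python
import numpy as np

# Hypothetical toy word-embedding table standing in for a distributional
# semantic space; random vectors are illustrative placeholders only.
rng = np.random.default_rng(0)
VOCAB = ["car", "tire", "wrench", "person", "dog", "change", "vehicle"]
EMBED = {w: rng.standard_normal(8) for w in VOCAB}

def embed_text(words):
    """Embed a free-text query as the mean of its word vectors."""
    vecs = [EMBED[w] for w in words if w in EMBED]
    return np.mean(vecs, axis=0)

def embed_video(concept_scores):
    """Embed a video as the confidence-weighted sum of the word vectors
    of its detected concepts (objects/actions), L2-normalized."""
    v = sum(s * EMBED[c] for c, s in concept_scores.items() if c in EMBED)
    return v / np.linalg.norm(v)

def cosine(a, b):
    """Cosine similarity used to rank videos against the query."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy example: concept detector confidences for one video, scored against
# the free-text event query "changing a vehicle tire".
video = embed_video({"tire": 0.9, "wrench": 0.7, "car": 0.6, "dog": 0.05})
query = embed_text(["change", "vehicle", "tire"])
score = cosine(video, query)
print(score)
```

Ranking all videos by this score yields a zero-shot retrieval list for the event query without any event-specific training examples.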


Related research

01/14/2016
Dynamic Concept Composition for Zero-Example Event Detection
In this paper, we focus on automatically detecting events in unconstrain...

06/13/2019
Semantics to Space(S2S): Embedding semantics into spatial space for zero-shot verb-object query inferencing
We present a novel deep zero-shot learning (ZSL) model for inferencing h...

05/05/2017
Unified Embedding and Metric Learning for Zero-Exemplar Event Detection
Event detection in unconstrained videos is conceived as a content-based ...

04/14/2022
The Art of Prompting: Event Detection based on Type Specific Prompts
We compare various forms of prompts to represent event types and develop...

06/26/2021
Generalized Zero-Shot Learning using Multimodal Variational Auto-Encoder with Semantic Concepts
With the ever-increasing amount of data, the central challenge in multim...

01/02/2019
Action2Vec: A Crossmodal Embedding Approach to Action Learning
We describe a novel cross-modal embedding space for actions, named Actio...

11/09/2022
Efficient Zero-shot Event Extraction with Context-Definition Alignment
Event extraction (EE) is the task of identifying interested event mentio...
