Cross-media Structured Common Space for Multimedia Event Extraction

05/05/2020
by Manling Li, et al.
University of Illinois at Urbana-Champaign
Columbia University

We introduce a new task, MultiMedia Event Extraction (M2E2), which aims to extract events and their arguments from multimedia documents. We develop the first benchmark for this task and collect a dataset of 245 multimedia news articles with extensively annotated events and arguments. We propose a novel method, Weakly Aligned Structured Embedding (WASE), that encodes structured representations of semantic information from textual and visual data into a common embedding space. The structures are aligned across modalities with a weakly supervised training strategy, which makes it possible to exploit available resources without explicit cross-media annotation. Compared to uni-modal state-of-the-art methods, our approach achieves 4.0% and 9.8% absolute F-score gains on text event argument role labeling and visual event extraction. Compared to state-of-the-art multimedia unstructured representations, we achieve 8.3% and 5.0% absolute F-score gains on multimedia event extraction and argument role labeling, respectively. By utilizing images, we extract 21.4% more event mentions than traditional text-only methods.
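
To make the idea of a common embedding space trained from weak alignment more concrete, the sketch below projects text and image feature vectors into a shared space and scores them with a margin ranking loss over caption-image pairs. This is an illustrative assumption about how such alignment can be set up in general, not the WASE implementation: the function names (`project`, `cosine`, `ranking_loss`), the linear projection, and the margin value are all hypothetical.

```python
import math

def project(matrix, vec):
    """Apply a linear map (list of rows) to a feature vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def ranking_loss(text_vecs, image_vecs, margin=0.2):
    """Hinge-style ranking loss over weakly aligned pairs.

    Assumption: text_vecs[i] and image_vecs[i] come from the same
    caption-image pair; every other image is treated as a negative.
    No finer-grained (event- or argument-level) annotation is used.
    """
    total = 0.0
    for i, text in enumerate(text_vecs):
        positive = cosine(text, image_vecs[i])
        for j, image in enumerate(image_vecs):
            if j != i:
                total += max(0.0, margin - positive + cosine(text, image))
    return total

# Hypothetical 3-d text features mapped into a 2-d common space.
text_projection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
captions = [project(text_projection, [1.0, 0.0, 5.0]),
            project(text_projection, [0.0, 1.0, 5.0])]
images = [[1.0, 0.0], [0.0, 1.0]]
loss = ranking_loss(captions, images)
```

The loss is zero when each caption is closer to its own image than to any other image by at least the margin, and grows as mismatched pairs become more similar than matched ones.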

09/27/2021

Joint Multimedia Event Extraction from Video and Article

Visual and textual modalities contribute complementary information about...
11/03/2022

Video Event Extraction via Tracking Visual States of Arguments

Video event extraction aims to detect salient events from a video and id...
05/02/2021

Event Argument Extraction using Causal Knowledge Structures

Event Argument extraction refers to the task of extracting structured in...
01/13/2022

CLIP-Event: Connecting Text and Images with Event Structures

Vision-language (V+L) pretraining models have achieved great success in ...
06/14/2022

Multimodal Event Graphs: Towards Event Centric Understanding of Multimodal World

Understanding how events described or shown in multimedia content relate...
09/30/2020

Visual Semantic Multimedia Event Model for Complex Event Detection in Video Streams

Multimedia data is highly expressive and has traditionally been very dif...
08/12/2016

Self-paced Learning for Weakly Supervised Evidence Discovery in Multimedia Event Search

Multimedia event detection has been receiving increasing attention in re...