The YLI-MED Corpus: Characteristics, Procedures, and Plans

03/13/2015
by Julia Bernd, et al.

The YLI Multimedia Event Detection corpus is a public-domain index of videos with annotations and computed features, specialized for research in multimedia event detection (MED), i.e., automatically identifying what's happening in a video by analyzing the audio and visual content. The videos indexed in the YLI-MED corpus are a subset of the larger YLI feature corpus, which is being developed by the International Computer Science Institute and Lawrence Livermore National Laboratory based on the Yahoo Flickr Creative Commons 100 Million (YFCC100M) dataset. The videos in YLI-MED are categorized as depicting one of ten target events, or no target event, and are annotated for additional attributes like language spoken and whether the video has a musical score. The annotations also include degree of annotator agreement and average annotator confidence scores for the event categorization of each video. Version 1.0 of YLI-MED includes 1,823 "positive" videos that depict the target events and 48,138 "negative" videos, as well as 177 supplementary videos that are similar to event videos but are not positive examples. Our goal in producing YLI-MED is to be as open about our data and procedures as possible. This report describes the procedures used to collect the corpus; gives detailed descriptive statistics about the corpus makeup (and how video attributes affected annotators' judgments); discusses possible biases in the corpus introduced by our procedural choices and compares it with the most similar existing dataset, TRECVID MED's HAVIC corpus; and gives an overview of our future plans for expanding the annotation effort.
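To make the corpus structure concrete, below is a minimal sketch of how the event annotations might be tallied once the index is downloaded. It assumes the index is a tab-separated metadata file; the file name and the column names (video_id, event_category, annotator_confidence) are illustrative assumptions for this sketch, not the corpus's actual schema, so consult the YLI-MED documentation for the real field names.

import csv
from collections import Counter

# Tally how many videos fall into each event category and compute the
# mean annotator confidence across the index.
# NOTE: the file name and column names are illustrative assumptions;
# check the YLI-MED documentation for the actual index schema.
def summarize_index(path="yli_med_index.tsv"):
    counts = Counter()
    confidence_sum = 0.0
    n = 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            counts[row["event_category"]] += 1  # e.g., an event ID or "none"
            confidence_sum += float(row["annotator_confidence"])
            n += 1
    avg_conf = confidence_sum / n if n else 0.0
    return counts, avg_conf

if __name__ == "__main__":
    counts, avg_conf = summarize_index()
    for event, count in counts.most_common():
        print(f"{event}: {count} videos")
    print(f"Average annotator confidence: {avg_conf:.2f}")

A summary like this is a quick sanity check that a local copy of the index matches the published counts (e.g., 1,823 positives and 48,138 negatives in Version 1.0).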

Related research

EventNet Version 1.1 Technical Report (05/24/2016)
EventNet is a large-scale video corpus and event ontology consisting of ...

Joint Multimedia Event Extraction from Video and Article (09/27/2021)
Visual and textual modalities contribute complementary information about...

Unified Embedding and Metric Learning for Zero-Exemplar Event Detection (05/05/2017)
Event detection in unconstrained videos is conceived as a content-based ...

Towards Long Form Audio-visual Video Understanding (06/15/2023)
We live in a world filled with never-ending streams of multimodal inform...

EEV Dataset: Predicting Expressions Evoked by Diverse Videos (01/15/2020)
When we watch videos, the visual and auditory information we experience ...

Audiovisual Moments in Time: A Large-Scale Annotated Dataset of Audiovisual Actions (08/18/2023)
We present Audiovisual Moments in Time (AVMIT), a large-scale dataset of...

Rate distortion optimization over large scale video corpus with machine learning (08/27/2020)
We present an efficient codec-agnostic method for bitrate allocation ove...
