ContentWise Impressions: An Industrial Dataset with Impressions Included

In this article, we introduce the ContentWise Impressions dataset, a collection of implicit interactions and impressions of movies and TV series from an Over-The-Top media service, which delivers its media content over the Internet. The dataset is distinguished from other available multimedia recommendation datasets by its size, by being open-source, and by the availability of impressions, i.e., the recommendations shown to the user. We describe the data collection process, the preprocessing applied, and the dataset's characteristics and statistics in comparison with other commonly used datasets. We also highlight several possible use cases and research questions that can benefit from the availability of user impressions in an open-source dataset. Furthermore, we release software tools to load and split the data, as well as examples of how to use both user interactions and impressions in several common recommendation algorithms.


