Movie Box Office Prediction With Self-Supervised and Visually Grounded Pretraining

04/20/2023
by Qin Chao, et al.

Investments in movie production carry a high level of risk because movie revenues follow long-tailed and bimodal distributions. Accurate prediction of box-office revenue may mitigate this uncertainty and encourage investment. However, learning effective representations for actors, directors, and user-generated content-related keywords remains a challenging open problem. In this work, we investigate the effects of self-supervised pretraining and propose visual grounding of content keywords in objects from movie posters as a pretraining objective. Experiments on a large dataset of 35,794 movies demonstrate significant benefits of self-supervised training and visual grounding. In particular, visual grounding pretraining substantially improves learning on movies with content keywords and achieves a 14.5% performance gain compared to a finetuned BERT model with identical architecture.


