Will You Ever Become Popular? Learning to Predict Virality of Dance Clips

11/06/2021
by Jiahao Wang, et al.

Dance challenges are going viral in video communities like TikTok. Once a challenge becomes popular, thousands of short-form videos are uploaded within a couple of days. Virality prediction for dance challenges is therefore of great commercial value and has a wide range of applications, such as smart recommendation and popularity promotion. In this paper, a novel multi-modal framework that integrates skeletal, holistic appearance, facial and scenic cues is proposed for comprehensive dance virality prediction. To model body movements, we propose a pyramidal skeleton graph convolutional network (PSGCN) which hierarchically refines spatio-temporal skeleton graphs. Meanwhile, we introduce a relational temporal convolutional network (RTCN) to exploit appearance dynamics with non-local temporal relations. An attentive fusion approach is finally proposed to adaptively aggregate predictions from different modalities. To validate our method, we introduce a large-scale viral dance video (VDV) dataset, which contains over 4,000 dance clips of eight viral dance challenges. Extensive experiments on the VDV dataset demonstrate the efficacy of our model. Furthermore, we show that short video applications like multi-dimensional recommendation and action feedback can be derived from our model.
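
The abstract states only that predictions from the different modalities are aggregated adaptively by an attentive fusion step; it does not give the formulation. As a rough illustration of the general idea, the sketch below fuses per-modality virality scores with softmax-normalized attention weights. The function names, the example scores, and the attention logits are all hypothetical, and in a trained model the logits would be produced by a learned module rather than set by hand.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_fusion(predictions, attention_logits):
    """Weighted sum of per-modality predictions.

    Assumption: one scalar prediction and one scalar attention logit per
    modality. This is an illustrative stand-in, not the paper's method.
    """
    if len(predictions) != len(attention_logits):
        raise ValueError("need one attention logit per prediction")
    weights = softmax(attention_logits)
    fused = sum(w * p for w, p in zip(weights, predictions))
    return fused, weights


# The four cues named in the abstract; the scores are made-up placeholders.
modalities = ["skeletal", "appearance", "facial", "scenic"]
predictions = [0.8, 0.6, 0.4, 0.2]

# Equal logits reduce the fusion to a plain average of the scores.
uniform_fused, uniform_weights = attentive_fusion(predictions, [0.0, 0.0, 0.0, 0.0])

# A large logit on the skeletal cue pulls the fused score toward 0.8.
skewed_fused, skewed_weights = attentive_fusion(predictions, [10.0, 0.0, 0.0, 0.0])
```

Because the weights are non-negative and sum to one, the fused score always lies between the smallest and largest per-modality prediction, which makes the contribution of each cue easy to inspect.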
