ContentCTR: Frame-level Live Streaming Click-Through Rate Prediction with Multimodal Transformer

06/26/2023
by   Jiaxin Deng, et al.

In recent years, live streaming platforms have gained immense popularity because they allow users to broadcast their videos and interact in real time with hosts and peers. Because live content changes dynamically, accurate recommendation models are crucial for enhancing user experience. However, most previous works treat a live stream as a single item and build the Click-Through Rate (CTR) prediction framework at the item level, neglecting the dynamic changes that occur even within the same live room. In this paper, we propose ContentCTR, a model that leverages a multimodal transformer for frame-level CTR prediction. First, we present an end-to-end framework that makes full use of multimodal information, including visual frames, audio, and comments, to identify the most attractive live frames. Second, to prevent the model from collapsing into a mediocre solution, we propose a novel pairwise loss function with first-order difference constraints that exploits the contrastive information between highlight and non-highlight frames. Additionally, we design a temporal text-video alignment module based on Dynamic Time Warping to eliminate noise caused by the ambiguity and non-sequential alignment of visual and textual information. We conduct extensive experiments on both real-world scenarios and public datasets, and our ContentCTR model outperforms traditional recommendation models in capturing real-time content changes. Moreover, we deploy the proposed method on our company platform, and the results of online A/B testing further validate its practical significance.
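
The abstract does not describe the internals of the text-video alignment module, so the following is only a minimal sketch of classic Dynamic Time Warping, the technique the module is said to build on, and not the authors' implementation. The function name `dtw_align`, the scalar toy features, and the absolute-difference cost are all assumptions made for illustration; a real system would presumably use a distance between learned text and frame embeddings.

```python
import math


def dtw_align(cost):
    """Classic DTW over a precomputed cost matrix (hypothetical helper).

    cost[i][j] is the distance between text segment i and video frame j.
    Returns the minimal accumulated cost and the monotonic alignment path
    as a list of (text_index, frame_index) pairs.
    """
    n, m = len(cost), len(cost[0])
    # acc[i][j] = best accumulated cost aligning the first i text
    # segments with the first j frames; row/column 0 are sentinels.
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
            acc[i][j] = cost[i - 1][j - 1] + best_prev

    # Backtrack from the end to recover the alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),  # advance both sequences
            (acc[i - 1][j], i - 1, j),          # advance text only
            (acc[i][j - 1], i, j - 1),          # advance video only
        )
    path.reverse()
    return acc[n][m], path


# Toy example: scalar stand-ins for text and frame features (assumed data).
text_feats = [0.0, 1.0]
frame_feats = [0.0, 1.0, 1.0]
cost = [[abs(t - f) for f in frame_feats] for t in text_feats]
total, path = dtw_align(cost)
print(total, path)  # 0.0 [(0, 0), (1, 1), (1, 2)]
```

In this toy case the second text segment is matched to both of the last two frames, which shows how DTW tolerates text and video advancing at different rates while keeping the alignment in temporal order.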


