BERT4CTR: An Efficient Framework to Combine Pre-trained Language Model with Non-textual Features for CTR Prediction

08/17/2023
by   Dong Wang, et al.

Although deep pre-trained language models have shown promising benefit in a large set of industrial scenarios, including Click-Through-Rate (CTR) prediction, it is challenging to integrate pre-trained language models, which handle only textual signals, into a prediction pipeline that also uses non-textual features. Up to now, two directions have been explored for integrating multi-modal inputs during fine-tuning of pre-trained language models. The first fuses the output of the language model and the non-textual features through an aggregation layer, resulting in an ensemble framework in which the cross-information between textual and non-textual inputs is learned only in the aggregation layer. The second splits non-textual features into fine-grained fragments and transforms the fragments into new tokens that are combined with the textual ones, so that they can be fed directly to the transformer layers of the language model; however, this approach increases the complexity of learning and inference because of the numerous additional tokens. To address these limitations, we propose in this work a novel framework, BERT4CTR, with a Uni-Attention mechanism that can benefit from the interactions between non-textual and textual features while keeping training and inference time low through dimensionality reduction. Comprehensive experiments on both public and commercial data demonstrate that BERT4CTR significantly outperforms state-of-the-art frameworks for handling multi-modal inputs and is applicable to CTR prediction.
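To make the contrast with the two existing directions concrete, the sketch below illustrates one way a one-directional cross-attention ("Uni-Attention") layer could let non-textual features attend over the token representations produced by a pre-trained language model before a CTR head, with a small projection dimension standing in for the dimensionality reduction mentioned above. The module, its layer names, and all dimensions are illustrative assumptions based only on this abstract, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class UniAttentionSketch(nn.Module):
    """Hypothetical sketch: non-textual features attend one-way over textual
    token states from a pre-trained encoder (e.g., BERT). Dimensions and
    layer choices are illustrative assumptions, not the paper's design."""

    def __init__(self, text_dim=768, nontext_dim=32, attn_dim=64):
        super().__init__()
        # Dimensionality reduction: project both sides into a small attention space.
        self.q_proj = nn.Linear(nontext_dim, attn_dim)
        self.k_proj = nn.Linear(text_dim, attn_dim)
        self.v_proj = nn.Linear(text_dim, attn_dim)
        self.classifier = nn.Linear(attn_dim + nontext_dim, 1)

    def forward(self, text_hidden, nontext_feats):
        # text_hidden:   (batch, seq_len, text_dim) hidden states from the language model
        # nontext_feats: (batch, nontext_dim) embedded numerical/categorical features
        q = self.q_proj(nontext_feats).unsqueeze(1)            # (B, 1, attn_dim)
        k = self.k_proj(text_hidden)                           # (B, L, attn_dim)
        v = self.v_proj(text_hidden)                           # (B, L, attn_dim)
        scores = torch.softmax(q @ k.transpose(1, 2) / k.size(-1) ** 0.5, dim=-1)
        cross = (scores @ v).squeeze(1)                        # (B, attn_dim)
        logit = self.classifier(torch.cat([cross, nontext_feats], dim=-1))
        return torch.sigmoid(logit)                            # predicted click probability

# Usage with dummy tensors (shapes only; no pre-trained model is loaded here):
model = UniAttentionSketch()
text_hidden = torch.randn(4, 128, 768)    # e.g., last hidden states from BERT
nontext = torch.randn(4, 32)
p_click = model(text_hidden, nontext)     # (4, 1) CTR estimates
```

In this sketch, queries come only from the non-textual side, so the extra attention is a single pass over the token sequence in a reduced dimension rather than full token-level self-attention over added tokens, which is one way to read the abstract's claim of low training and inference cost.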


