AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks

by Weiping Song, et al.

Click-through rate (CTR) prediction, which aims to predict the probability of a user clicking an ad or an item, is critical to many online applications such as online advertising and recommender systems. The problem is challenging because (1) the input features (e.g., the user id, user age, item id, item category) are usually sparse and high-dimensional, and (2) effective prediction relies on high-order combinatorial features (a.k.a. cross features), which are very time-consuming for domain experts to hand-craft and cannot be exhaustively enumerated. Therefore, there have been efforts to find low-dimensional representations of the sparse and high-dimensional raw features and their meaningful combinations. In this paper, we propose an effective and efficient algorithm to automatically learn high-order combinations of the input features. The proposed algorithm is general and can be applied to both numerical and categorical input features. Specifically, we map both the numerical and categorical features into the same low-dimensional space. A multi-head self-attentive neural network with residual connections is then used to explicitly model the feature interactions in this low-dimensional space. By stacking multiple layers of the multi-head self-attentive network, different orders of feature combinations can be modeled. The whole model can be efficiently fit on large-scale raw data in an end-to-end fashion. Experimental results on four real-world datasets show that our proposed approach not only outperforms existing state-of-the-art approaches for prediction but also offers good explainability.
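The two steps named in the abstract, a shared low-dimensional embedding for numerical and categorical features followed by a multi-head self-attentive layer with a residual connection, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the random weights, the choice to embed a numerical field by scaling a per-field vector with its value, and the ReLU after the residual sum are all assumptions made for illustration, since the abstract does not spell out these details.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    """Small random weights; stands in for learned parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def embed_fields(cat_ids, num_values, cat_tables, num_vectors):
    """Map every field into the same d-dimensional space.

    Categorical field: look up the embedding row for its id.
    Numerical field: scale that field's vector by the scalar value
    (an assumption of this sketch).
    """
    embs = [table[i] for table, i in zip(cat_tables, cat_ids)]
    embs += [[v * x for x in vec] for vec, v in zip(num_vectors, num_values)]
    return embs

def interacting_layer(embs, heads, W_res):
    """One multi-head self-attention layer over fields, plus a residual path.

    heads is a list of (W_q, W_k, W_v) triples, one per attention head.
    """
    out = []
    for e_m in embs:
        concat = []
        for W_q, W_k, W_v in heads:
            q = matvec(W_q, e_m)
            # Attention of field m over every field k (including itself).
            scores = [sum(a * b for a, b in zip(q, matvec(W_k, e_k)))
                      for e_k in embs]
            alpha = softmax(scores)
            values = [matvec(W_v, e_k) for e_k in embs]
            concat.extend(sum(a * v[j] for a, v in zip(alpha, values))
                          for j in range(len(values[0])))
        # Residual connection: project the input and add it back, then ReLU.
        res = matvec(W_res, e_m)
        out.append([max(0.0, c + r) for c, r in zip(concat, res)])
    return out

rng = random.Random(0)
d, n_heads, d_head = 8, 2, 4  # hypothetical sizes; n_heads * d_head == d

cat_tables = [rand_matrix(5, d, rng), rand_matrix(3, d, rng)]  # two categorical fields
num_vectors = [rand_matrix(1, d, rng)[0]]                      # one numerical field

def new_layer():
    heads = [tuple(rand_matrix(d_head, d, rng) for _ in range(3))
             for _ in range(n_heads)]
    return heads, rand_matrix(n_heads * d_head, d, rng)

fields = embed_fields([4, 1], [0.37], cat_tables, num_vectors)
layer1 = interacting_layer(fields, *new_layer())
layer2 = interacting_layer(layer1, *new_layer())  # stacked layer
print(len(layer2), len(layer2[0]))
```

Each output vector of the first layer mixes information from pairs of fields, and feeding those vectors into a second layer mixes them again, which is one way to read the abstract's statement that different layers model different orders of feature combinations. The attention weights `alpha` are also the natural quantity to inspect when looking for the explainability the abstract mentions.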



