PHN: Parallel heterogeneous network with soft gating for CTR prediction

06/18/2022
by Ri Su, et al.

The Click-through Rate (CTR) prediction task is a fundamental task in recommender systems. Most previous CTR models are built on the Wide & Deep structure and have gradually evolved into parallel structures with different modules. However, simply accumulating parallel structures leads to higher structural complexity and longer training time. Moreover, because the output layer uses a Sigmoid activation, the linearly summed activations of the parallel structures can easily push samples into the weak-gradient interval during training, producing a weak gradient phenomenon and reducing training effectiveness. To this end, this paper proposes a Parallel Heterogeneous Network (PHN) model, which builds a parallel structure from three different interaction analysis methods and uses Soft Selection Gating (SSG) to fuse the heterogeneous features produced by the different structures. Finally, residual links with trainable parameters are used in the network to mitigate the weak gradient phenomenon. Furthermore, we demonstrate the effectiveness of PHN in a large number of comparative experiments and visualize the behavior of the model during training and across its structure.
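For a concrete picture of the idea sketched in the abstract, below is a minimal PyTorch sketch of soft-gated fusion over parallel heterogeneous branches with a trainable residual link. The branch choices, layer sizes, and all names (ParallelHeterogeneousNet, SoftSelectionGate, res_scale, etc.) are illustrative assumptions, not the authors' implementation; the paper's three interaction methods and its residual design may differ in detail.

```python
# Minimal sketch of the PHN idea from the abstract, assuming PyTorch.
# All module names and branch choices are hypothetical placeholders.
import torch
import torch.nn as nn


class SoftSelectionGate(nn.Module):
    """Softly weights heterogeneous branch outputs before they are combined."""

    def __init__(self, num_branches: int, dim: int):
        super().__init__()
        self.gate = nn.Linear(dim, num_branches)

    def forward(self, x, branch_outputs):
        # branch_outputs: list of (batch, dim) tensors from the parallel branches
        weights = torch.softmax(self.gate(x), dim=-1)        # (batch, num_branches)
        stacked = torch.stack(branch_outputs, dim=1)         # (batch, num_branches, dim)
        return (weights.unsqueeze(-1) * stacked).sum(dim=1)  # (batch, dim)


class ParallelHeterogeneousNet(nn.Module):
    def __init__(self, dim: int = 32):
        super().__init__()
        # Three heterogeneous interaction branches (stand-ins for the paper's
        # three interaction analysis methods).
        self.branch_mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.branch_cross = nn.Linear(dim, dim)                           # cross/product-style branch
        self.branch_attn = nn.Sequential(nn.Linear(dim, dim), nn.Tanh())  # attention-style branch
        self.ssg = SoftSelectionGate(num_branches=3, dim=dim)
        # Residual link with a trainable scale, intended to ease the weak-gradient issue.
        self.res_scale = nn.Parameter(torch.ones(1))
        self.out = nn.Linear(dim, 1)

    def forward(self, x):
        branches = [self.branch_mlp(x), self.branch_cross(x), self.branch_attn(x)]
        fused = self.ssg(x, branches)
        fused = fused + self.res_scale * x        # trainable residual link
        return torch.sigmoid(self.out(fused))     # CTR probability


if __name__ == "__main__":
    model = ParallelHeterogeneousNet(dim=32)
    dense_input = torch.randn(8, 32)              # e.g., concatenated feature embeddings
    print(model(dense_input).shape)               # torch.Size([8, 1])
```

The trainable residual scale gives the input a shortcut around the gated branches, which is one plausible way to keep gradients from collapsing when the summed branch activations saturate the output Sigmoid.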

Related research

03/04/2023 · Lon-eå at SemEval-2023 Task 11: A Comparison of Activation Functions for Soft and Hard Label Prediction
We study the influence of different activation functions in the output l...

02/08/2019 · A simple and efficient architecture for trainable activation functions
Learning automatically the best activation function for the task is an a...

07/24/2023 · DEPHN: Different Expression Parallel Heterogeneous Network using virtual gradient optimization for Multi-task Learning
Recommendation system algorithm based on multi-task learning (MTL) is th...

04/12/2018 · Asynchronous Parallel Sampling Gradient Boosting Decision Tree
With the development of big data technology, Gradient Boosting Decision ...

10/01/2017 · Pyramidal RoR for Image Classification
The Residual Networks of Residual Networks (RoR) exhibits excellent perf...

03/01/2020 · Soft-Root-Sign Activation Function
The choice of activation function in deep networks has a significant eff...

12/10/2021 · Layer-Parallel Training of Residual Networks with Auxiliary-Variable Networks
Gradient-based methods for the distributed training of residual networks...
