Beyond Double Ascent via Recurrent Neural Tangent Kernel in Sequential Recommendation

by Ruihong Qiu, et al.

Overfitting has long been considered a common issue for large neural network models in sequential recommendation. In our study, we observe an interesting phenomenon: overfitting is temporary. As the model scale increases, performance first rises, then falls (i.e., overfitting), and finally rises again, a trend we call double ascent in this paper. We therefore hypothesise that a considerably larger model will generalise better and achieve higher performance. In the extreme case of infinite width, performance is expected to reach the limit of the given architecture. Unfortunately, directly building such a huge model is impractical because of resource limits. In this paper, we propose the Overparameterised Recommender (OverRec), which utilises a recurrent neural tangent kernel (RNTK) as a similarity measure between user sequences, thereby bypassing the hardware restrictions on huge models. We further prove that the RNTK for the tied input-output embeddings used in recommendation is the same as the RNTK for general untied input-output embeddings, which makes the RNTK theoretically suitable for recommendation. Since the RNTK is derived analytically, OverRec does not require any training and avoids physically building the huge model. Extensive experiments on four datasets verify the state-of-the-art performance of OverRec.
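To make the idea of recommending from a sequence kernel without gradient training more concrete, the following is a minimal sketch using kernel ridge regression. It is not the authors' implementation, and the abstract does not state that OverRec uses exactly this predictor. In particular, `standin_kernel` is a hypothetical recency-weighted item-overlap kernel that stands in for the RNTK, whose analytic form is not given here. All function names, the `decay` and `ridge` parameters, and the toy data are assumptions made for illustration.

```python
import numpy as np

def feature_map(seqs, n_items, decay=0.8):
    """Recency-weighted bag of items (hypothetical stand-in, NOT the RNTK)."""
    phi = np.zeros((len(seqs), n_items))
    for row, seq in enumerate(seqs):
        for pos, item in enumerate(seq):
            # Later positions in the sequence receive larger weights.
            phi[row, item] += decay ** (len(seq) - 1 - pos)
    return phi

def standin_kernel(seqs_a, seqs_b, n_items):
    """Similarity between two sets of user sequences (assumed kernel)."""
    return feature_map(seqs_a, n_items) @ feature_map(seqs_b, n_items).T

def kernel_regression_scores(train_seqs, train_next, test_seqs, n_items, ridge=1e-3):
    """Score every item for each test sequence via kernel ridge regression."""
    k_train = standin_kernel(train_seqs, train_seqs, n_items)
    k_test = standin_kernel(test_seqs, train_seqs, n_items)
    targets = np.eye(n_items)[train_next]  # one-hot next-item labels
    # Closed-form solve; no gradient-based training is involved.
    alpha = np.linalg.solve(k_train + ridge * np.eye(len(train_seqs)), targets)
    return k_test @ alpha

# Toy data: users who saw items 0, 1, 2 went on to item 3; users who saw 3, 4 went on to item 0.
train_seqs = [[0, 1, 2], [0, 1, 2], [3, 4], [3, 4]]
train_next = [3, 3, 0, 0]
test_seqs = [[0, 1, 2], [3, 4]]

scores = kernel_regression_scores(train_seqs, train_next, test_seqs, n_items=5)
top_items = scores.argmax(axis=1)
print(top_items)
```

In this sketch, swapping in a different kernel only means replacing `standin_kernel`; the closed-form solve stays the same, which is why an analytically derived kernel removes the need to build or train the wide network itself.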



Getting deep recommenders fit: Bloom embeddings for sparse binary input/output networks

Recommendation algorithms that incorporate techniques from deep learning...

Realization Theory Of Recurrent Neural ODEs Using Polynomial System Embeddings

In this paper we show that neural ODE analogs of recurrent (ODE-RNN) and...

The Recurrent Neural Tangent Kernel

The study of deep networks (DNs) in the infinite-width limit, via the so...

PAS: A Position-Aware Similarity Measurement for Sequential Recommendation

The common item-based collaborative filtering framework becomes a typica...

Context-aware Sequential Recommendation

Since sequential information plays an important role in modeling user be...

Towards Understanding the Overfitting Phenomenon of Deep Click-Through Rate Prediction Models

Deep learning techniques have been applied widely in industrial recommen...

Associative Memory in Iterated Overparameterized Sigmoid Autoencoders

Recent work showed that overparameterized autoencoders can be trained to...
