GUIM – General User and Item Embedding with Mixture of Representation in E-commerce

07/02/2022
by   Chao Yang, et al.

Our goal is to build general representations (embeddings) for each user and each product item across Alibaba's businesses, including Taobao and Tmall, which are among the world's biggest e-commerce websites. The representations of users and items play a critical role in various downstream applications, including recommendation systems, search, marketing, demand forecasting, and so on. Inspired by the BERT model in the natural language processing (NLP) domain, we propose GUIM (General User Item embedding with Mixture of representation), a model that achieves this goal with massive, structured, multi-modal data covering the interactions among hundreds of millions of users and items. We utilize a mixture of representation (MoR) as a novel representation form to model the diverse interests of each user. In addition, we use the InfoNCE objective from contrastive learning to avoid the intractable computational cost caused by the enormous size of the item (token) vocabulary. Finally, we propose a set of representative downstream tasks to serve as a standard benchmark for evaluating the quality of the learned user and/or item embeddings, analogous to the GLUE benchmark in the NLP domain. Our experimental results on these downstream tasks clearly show the comparative value of the embeddings learned by our GUIM model.
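The abstract notes that InfoNCE from contrastive learning avoids a softmax over the full item vocabulary. A minimal sketch of that idea, using in-batch negatives, is shown below. This is an illustrative reconstruction, not the paper's actual implementation; the batch size, embedding dimension, and temperature are arbitrary assumptions.

```python
import numpy as np

def info_nce_loss(user_emb, item_emb, temperature=0.07):
    """In-batch InfoNCE: each user's positive is the item at the same
    batch index; the other items in the batch act as negatives.
    The softmax therefore runs over the batch (size B), not over the
    entire item vocabulary."""
    # L2-normalize so dot products become cosine similarities
    u = user_emb / np.linalg.norm(user_emb, axis=1, keepdims=True)
    v = item_emb / np.linalg.norm(item_emb, axis=1, keepdims=True)
    logits = u @ v.T / temperature          # (B, B) similarity matrix
    # numerically stable log-softmax over each row
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # the diagonal holds each user's positive pair
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
users = rng.normal(size=(8, 16))
# make each item correlated with its matching user (hypothetical toy data)
items = users + 0.1 * rng.normal(size=(8, 16))
loss = info_nce_loss(users, items)
```

With matched user-item pairs the loss is small; for unrelated items it approaches log(B), the value of a uniform guess over the batch.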


Related research

11/28/2019 - Product Knowledge Graph Embedding for E-commerce
In this paper, we propose a new product knowledge graph (PKG) embedding ...

02/24/2021 - Theoretical Understandings of Product Embedding for E-commerce Machine Learning
Product embeddings have been heavily investigated in the past few years, ...

04/26/2023 - Self-Supervised Multi-Modal Sequential Recommendation
With the increasing development of e-commerce and online services, perso...

07/11/2022 - Learning Large-scale Universal User Representation with Sparse Mixture of Experts
Learning user sequence behaviour embedding is very sophisticated and cha...

06/11/2021 - Bridging Subword Gaps in Pretrain-Finetune Paradigm for Natural Language Generation
A well-known limitation in pretrain-finetune paradigm lies in its inflex...

09/30/2019 - Hotel2vec: Learning Attribute-Aware Hotel Embeddings with Self-Supervision
We propose a neural network architecture for learning vector representat...

06/30/2023 - Of Spiky SVDs and Music Recommendation
The truncated singular value decomposition is a widely used methodology ...
