Cross-Domain Product Representation Learning for Rich-Content E-Commerce

by Xuehan Bai, et al.

The proliferation of short-video and live-streaming platforms has revolutionized how consumers engage in online shopping. Instead of browsing product pages, consumers are now turning to rich-content e-commerce, where they can purchase products through dynamic and interactive media such as short videos and live streams. This emerging form of online shopping has introduced technical challenges, because the same product may be presented differently across media domains. A unified product representation is therefore essential for cross-domain product recognition, which in turn supports a good user search experience and effective product recommendations. Despite the urgent industrial need for a unified cross-domain product representation, previous studies have focused predominantly on product pages, without taking short videos and live streams into account. To fill this gap in rich-content e-commerce, we introduce a large-scale cRoss-dOmain Product rEcognition dataset, called ROPE. ROPE covers a wide range of product categories and contains over 180,000 products, corresponding to millions of short videos and live streams. It is the first dataset to cover product pages, short videos, and live streams simultaneously, providing the basis for establishing a unified product representation across different media domains. Furthermore, we propose a Cross-dOmain Product rEpresentation framework, namely COPE, which unifies product representations across domains through multimodal learning over text and vision. Extensive experiments on downstream tasks demonstrate the effectiveness of COPE in learning a joint feature space for all product domains.




