e-CLIP: Large-Scale Vision-Language Representation Learning in E-commerce

07/01/2022
by Wonyoung Shin, et al.

Understanding vision and language representations of product content is vital for search and recommendation applications in e-commerce. To provide a backbone for online shopping platforms, and inspired by the recent success of representation learning research, we propose a contrastive learning framework that aligns language and visual models using unlabeled raw product text and images. We present the techniques we used to train large-scale representation learning models and share solutions that address domain-specific challenges. We evaluate our pre-trained model as a backbone for diverse downstream tasks, including category classification, attribute extraction, product matching, product clustering, and adult product recognition. Experimental results show that our proposed method outperforms the baseline on every downstream task, in both single-modality and multimodal settings.
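
To make the idea of contrastive alignment between product images and text concrete, the sketch below computes a generic CLIP-style symmetric contrastive (InfoNCE) loss over a small batch of paired embeddings. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the exact loss, and the function names (`l2_normalize`, `log_softmax`, `contrastive_loss`), the temperature value of 0.07, and the toy embeddings are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss; pair i is (image_embs[i], text_embs[i]).

    The temperature default is an assumed value for illustration only.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j]: cosine similarity of image i and text j, divided by temperature.
    logits = [
        [sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Image-to-text direction: image i should rank its own text i highest.
    i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: text j should rank its own image j highest.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    t2i = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return (i2t + t2i) / 2


# Toy batch: matching pairs share a direction, mismatched pairs are orthogonal.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is close to zero when each image is paired with its own text and much larger when the pairs are swapped, which is the signal such a framework would use to pull matching image and text representations together without labels.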

research
05/22/2023

Efficient Large-Scale Vision Representation Learning

In this article, we present our approach to single-modality vision repre...
research
02/24/2021

Theoretical Understandings of Product Embedding for E-commerce Machine Learning

Product embeddings have been heavily investigated in the past few years,...
research
12/07/2022

Learning-To-Embed: Adopting Transformer based models for E-commerce Products Representation Learning

Learning low-dimensional representation for large number of products pre...
research
12/11/2022

A Study of Slang Representation Methods

Warning: this paper contains content that may be offensive or upsetting....
research
08/10/2023

Cross-Domain Product Representation Learning for Rich-Content E-Commerce

The proliferation of short video and live-streaming platforms has revolu...
research
06/13/2021

InfoBehavior: Self-supervised Representation Learning for Ultra-long Behavior Sequence via Hierarchical Grouping

E-commerce companies have to face abnormal sellers who sell potentially-...
research
07/21/2023

DEFTri: A Few-Shot Label Fused Contextual Representation Learning For Product Defect Triage in e-Commerce

Defect Triage is a time-sensitive and critical process in a large-scale ...
