MAKE: Vision-Language Pre-training based Product Retrieval in Taobao Search

01/30/2023
by   Xiaoyang Zheng, et al.

Taobao Search consists of two phases: the retrieval phase and the ranking phase. Given a user query, the retrieval phase returns a subset of candidate products for the subsequent ranking phase. Recently, the paradigm of pre-training and fine-tuning has shown its potential in incorporating visual clues into retrieval tasks. In this paper, we focus on solving the problem of text-to-multimodal retrieval in Taobao Search. We consider that users' attention to titles or images varies across products. Hence, we propose a novel Modal Adaptation module for cross-modal fusion, which helps assign appropriate weights to texts and images across products. Furthermore, in e-commerce search, user queries tend to be brief and thus lead to a significant semantic imbalance between user queries and product titles. Therefore, we design a separate text encoder and a Keyword Enhancement mechanism to enrich the query representations and improve text-to-multimodal matching. To this end, we present a novel vision-language (V+L) pre-training method to exploit the multimodal information of (user query, product title, product image). Extensive experiments demonstrate that our retrieval-specific pre-training model (referred to as MAKE) outperforms existing V+L pre-training methods on the text-to-multimodal retrieval task. MAKE has been deployed online and brings major improvements to the retrieval system of Taobao Search.
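
The abstract does not specify how the Modal Adaptation module computes its weights, so the following is only an illustrative sketch of the general idea of per-product modality weighting, not the authors' implementation. It assumes a simple softmax gate over two scores; the names `fuse_modalities`, `relevance`, `gate_title`, and `gate_image` are hypothetical, and the gate vectors stand in for parameters that would be learned in a real model.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def fuse_modalities(title_emb, image_emb, gate_title, gate_image):
    """Weight title and image embeddings per product, then sum them.

    Assumption: each modality gets one scalar gate score, and a softmax
    turns the two scores into weights that sum to 1.
    """
    w_title, w_image = softmax(
        [dot(gate_title, title_emb), dot(gate_image, image_emb)]
    )
    fused = [w_title * t + w_image * v for t, v in zip(title_emb, image_emb)]
    return fused, (w_title, w_image)


def relevance(query_emb, title_emb, image_emb, gate_title, gate_image):
    """Score a query against the fused (title + image) product embedding."""
    fused, _ = fuse_modalities(title_emb, image_emb, gate_title, gate_image)
    return cosine(query_emb, fused)
```

With all-zero gate vectors the two modalities receive equal weight; a gate that scores the title higher for a given product shifts the fused embedding toward the title, which is the kind of per-product adjustment the abstract describes.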

research
04/10/2023

Delving into E-Commerce Product Retrieval with Vision-Language Pre-training

E-commerce search engines comprise a retrieval phase and a ranking phase...
research
07/17/2021

Neural Search: Learning Query and Product Representations in Fashion E-commerce

Typical e-commerce platforms contain millions of products in the catalog...
research
02/10/2023

Unified Vision-Language Representation Modeling for E-Commerce Same-Style Products Retrieval

Same-style products retrieval plays an important role in e-commerce plat...
research
06/25/2023

Enhancing Dynamic Image Advertising with Vision-Language Pre-training

In the multimedia era, image is an effective medium in search advertisin...
research
12/14/2021

ACE-BERT: Adversarial Cross-modal Enhanced BERT for E-commerce Retrieval

Nowadays on E-commerce platforms, products are presented to the customer...
research
11/15/2018

Boosting Search Performance Using Query Variations

Rank fusion is a powerful technique that allows multiple sources of info...
research
02/15/2022

CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval

We introduce CommerceMM - a multimodal model capable of providing a dive...
