Embracing Structure in Data for Billion-Scale Semantic Product Search

10/12/2021
by Vihan Lakshman, et al.

We present principled approaches to training and deploying dyadic neural embedding models at the billion scale, focusing on the application of semantic product search. Training a dyadic model means embedding two different types of entities (e.g., queries and documents, or users and movies) in a common vector space such that highly relevant pairs lie close together. At inference, given an embedding of one type (e.g., a query or a user), the goal is to retrieve the highly relevant entities of the other type (e.g., documents or movies, respectively). In this work, we show that exploiting the natural structure of real-world datasets helps address both the training and the inference challenge efficiently. Specifically, we model dyadic data as a bipartite graph with edges between pairs that have positive associations. We then partition this graph into semantically coherent clusters, which reduces the search space to a small subset of partitions for any given input. During training, this technique lets us mine hard negative examples efficiently; at inference, it lets us quickly find the nearest neighbors of a given embedding. We provide offline experimental results that demonstrate the efficacy of our techniques for both training and inference on a billion-scale Amazon.com product search dataset.
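The idea in the abstract can be illustrated with a toy sketch. The abstract does not say which partitioning algorithm, negative-sampling rule, or index the authors use, so everything below is an assumption made for illustration: connected components stand in for the graph partitioning, any non-positive product in a query's cluster counts as a hard negative, and inference probes the clusters whose product centroid scores highest against the query. All function names and data are hypothetical.

```python
from collections import defaultdict


def partition_bipartite(pairs):
    """Cluster a query-product bipartite graph.

    Assumption: connected components (via union-find) stand in for the
    semantic graph partitioning, which the abstract does not specify.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for q, p in pairs:
        root_q, root_p = find(("q", q)), find(("p", p))
        if root_q != root_p:
            parent[root_q] = root_p

    clusters = defaultdict(lambda: {"queries": set(), "products": set()})
    for node in list(parent):
        kind, name = node
        key = "queries" if kind == "q" else "products"
        clusters[find(node)][key].add(name)
    return list(clusters.values())


def hard_negatives(query, pairs, clusters):
    """Products in the query's cluster that are not its known positives."""
    positives = {p for q, p in pairs if q == query}
    for cluster in clusters:
        if query in cluster["queries"]:
            return sorted(cluster["products"] - positives)
    return []


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def retrieve(query_vec, clusters, product_vecs, n_probe=1, k=2):
    """Rank products only inside the n_probe best-matching clusters."""

    def centroid(cluster):
        vecs = [product_vecs[p] for p in sorted(cluster["products"])]
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    probed = sorted(
        clusters, key=lambda c: dot(query_vec, centroid(c)), reverse=True
    )[:n_probe]
    candidates = [p for c in probed for p in c["products"]]
    return sorted(
        candidates, key=lambda p: dot(query_vec, product_vecs[p]), reverse=True
    )[:k]


# Hypothetical toy data: (query, product) pairs with a positive association.
pairs = [
    ("running shoes", "sneaker_a"),
    ("trail shoes", "sneaker_a"),
    ("trail shoes", "boot_b"),
    ("usb cable", "cable_c"),
]
product_vecs = {
    "sneaker_a": [1.0, 0.0],
    "boot_b": [0.8, 0.2],
    "cable_c": [0.0, 1.0],
}

clusters = partition_bipartite(pairs)
negatives = hard_negatives("running shoes", pairs, clusters)
results = retrieve([0.9, 0.1], clusters, product_vecs)
```

On this toy data, "boot_b" is the only hard negative for "running shoes": it shares a cluster with the query but was never paired with it. The retrieval step scores only the footwear cluster and never touches "cable_c", which is the search-space reduction the abstract describes.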

research
07/01/2019

Semantic Product Search

We study the problem of semantic matching in product search, that is, gi...
research
08/30/2019

Learning to Ask: Question-based Sequential Bayesian Product Search

Product search is generally recognized as the first and foremost stage o...
research
06/17/2021

Embedding-based Product Retrieval in Taobao Search

Nowadays, the product search service of e-commerce platforms has become ...
research
04/07/2021

Distantly Supervised Transformers For E-Commerce Product QA

We propose a practical instant question answering (QA) system on product...
research
06/23/2021

Extreme Multi-label Learning for Semantic Matching in Product Search

We consider the problem of semantic matching in product search: given a ...
research
09/12/2022

An Embedding-Based Grocery Search Model at Instacart

The key to e-commerce search is how to best utilize the large yet noisy ...
