Interpretable Methods for Identifying Product Variants

by Rebecca West, et al.

For e-commerce companies with large product selections, organizing and grouping products in meaningful ways is important for creating great customer shopping experiences and cultivating an authoritative brand image. One important way of grouping products is to identify a family of product variants, where the variants are mostly the same but differ in slight yet distinct ways (e.g. color or pack size). In this paper, we introduce a novel approach to identifying product variants. It combines constrained clustering with tailored NLP techniques (e.g. extracting the product family name from an unstructured product title and identifying products with similar model numbers) to outperform an existing baseline that uses a vanilla classification approach. In addition, we design the algorithm to meet certain business criteria: high accuracy across a wide range of categories (e.g. appliances, decor, tools, and building materials), and interpretability, so that the model is accessible and understandable to all business partners.
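
The abstract does not spell out the algorithm, so the following is only a toy sketch of how model-number similarity and a clustering constraint might work together; it is not the authors' method. It assumes hypothetical product records with `id`, `brand`, and `model` fields, uses character-level string similarity from Python's standard `difflib` as a stand-in for the paper's tailored NLP step, and treats "different brands may never share a family" as an illustrative cannot-link constraint. The threshold of 0.8 is likewise an assumption.

```python
from difflib import SequenceMatcher
from itertools import combinations


def model_similarity(a: str, b: str) -> float:
    """Character-level similarity between two model numbers, in [0, 1]."""
    return SequenceMatcher(None, a.upper(), b.upper()).ratio()


def group_variants(products, threshold=0.8):
    """Group products into candidate variant families.

    products: list of dicts with hypothetical keys 'id', 'brand', 'model'.
    Two products are linked when their model numbers are similar enough,
    unless the (assumed) cannot-link constraint forbids it.
    Returns a list of sets of product ids.
    """
    # Union-find forest: every product starts in its own family.
    parent = {p["id"]: p["id"] for p in products}

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for p, q in combinations(products, 2):
        # Assumed cannot-link constraint: different brands never merge.
        if p["brand"] != q["brand"]:
            continue
        if model_similarity(p["model"], q["model"]) >= threshold:
            parent[find(p["id"])] = find(q["id"])

    families = {}
    for p in products:
        families.setdefault(find(p["id"]), set()).add(p["id"])
    return list(families.values())


# Made-up catalog rows, for illustration only.
catalog = [
    {"id": 1, "brand": "Acme", "model": "DW1234-10"},
    {"id": 2, "brand": "Acme", "model": "DW1234-20"},
    {"id": 3, "brand": "Other", "model": "DW1234-10"},
    {"id": 4, "brand": "Acme", "model": "ZX9000"},
]
families = group_variants(catalog)
```

Because every merge decision reduces to one similarity score and one explicit rule, each resulting family can be explained to a non-technical reader, which is in the spirit of the interpretability goal stated above.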

