Multi-output Headed Ensembles for Product Item Classification

07/29/2023
by Hotaka Shiokawa, et al.

In this paper, we revisit the problem of product item classification for large-scale e-commerce catalogs. The taxonomy of an e-commerce catalog consists of thousands of genres, to which items uploaded by merchants on a continuous basis are assigned. Genre assignments made by merchants are often wrong but are treated as ground-truth labels in automatically generated training sets, creating a feedback loop that degrades model quality over time. The problem is made worse by the unavailability of sizable curated training sets. In such a scenario it is common to combine multiple classifiers to combat the poor generalization of a single classifier. We propose an extensible deep-learning-based classification framework that benefits from the simplicity and robustness of averaging ensembles and fusion-based classifiers. We are also able to use metadata features and low-level feature engineering to boost classification performance. We show these improvements against robust industry-standard baseline models that employ hyperparameter optimization. Additionally, because real-world high-volume e-commerce catalogs undergo continuous insertions, deletions and updates, assessing model performance for deployment through A/B testing and/or manual annotation becomes a bottleneck. To this end, we also propose a novel way to evaluate model performance using user sessions, which provides additional insight beyond the traditional measures of precision and recall.
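To make the idea of an averaging ensemble concrete, the following is a minimal generic sketch and not the authors' architecture: each classifier produces logits over the same genre list, the logits are converted to probabilities with a softmax, and the probabilities are averaged before picking the top genre. The function names, genre labels and logit values are all hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_ensemble(per_model_logits: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-class probabilities produced by several classifiers."""
    probs = [softmax(logits) for logits in per_model_logits]
    n_classes = len(probs[0])
    if any(len(p) != n_classes for p in probs):
        raise ValueError("all classifiers must score the same set of genres")
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def predict_genre(
    per_model_logits: Sequence[Sequence[float]], genres: Sequence[str]
) -> Tuple[str, float]:
    """Return the genre with the highest averaged probability and that probability."""
    avg = average_ensemble(per_model_logits)
    best = max(range(len(avg)), key=avg.__getitem__)
    return genres[best], avg[best]


# Hypothetical genres and logits from three hypothetical classifiers.
genres = ["shoes", "bags", "watches"]
per_model_logits = [
    [2.0, 1.0, 0.1],
    [0.5, 1.5, 0.3],
    [1.8, 1.2, 0.2],
]
label, confidence = predict_genre(per_model_logits, genres)
print(label, round(confidence, 3))
```

In this toy input the second classifier prefers "bags", but the averaged probabilities still favor "shoes", which shows how averaging can dampen the effect of a single disagreeing model. A fusion-based classifier would instead combine intermediate features from the individual models and learn the combination, which this sketch does not attempt.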

research
04/12/2020

Large-scale Real-time Personalized Similar Product Recommendations

Similar product recommendation is one of the most common scenes in e-com...
research
12/14/2018

Don't Classify, Translate: Multi-Level E-Commerce Product Categorization Via Machine Translation

E-commerce platforms categorize their products into a multi-level taxono...
research
06/20/2016

Product Classification in E-Commerce using Distributional Semantics

Product classification is the task of automatically predicting a taxonom...
research
04/15/2020

TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories

Extracting structured knowledge from product profiles is crucial for var...
research
11/04/2022

Continuous Prompt Tuning Based Textual Entailment Model for E-commerce Entity Typing

The explosion of e-commerce has caused the need for processing and analy...
research
06/19/2020

Analyzing the Real-World Applicability of DGA Classifiers

Separating benign domains from domains generated by DGAs with the help o...
research
12/17/2018

Deep Heterogeneous Autoencoders for Collaborative Filtering

This paper leverages heterogeneous auxiliary information to address the ...
