Attribute Extraction from Product Titles in eCommerce

08/15/2016
by Ajinkya More, et al.

This paper presents a named entity extraction system for detecting attributes in product titles of eCommerce retailers like Walmart. The absence of syntactic structure in such short pieces of text makes extracting attribute values a challenging problem. We find that combining sequence labeling algorithms, such as Conditional Random Fields and Structured Perceptron, with a curated normalization scheme produces an effective system for extracting product attribute values from titles. To keep the discussion concrete, we illustrate the mechanics of the system from the point of view of one particular attribute: brand. We also discuss the importance of an attribute extraction system for retail websites with large product catalogs, compare our approach to other potential approaches to this problem, and end the paper with a discussion of the performance of our system.
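As a rough illustration of the two stages named above (sequence labeling followed by normalization), the sketch below assumes the tagger emits BIO-style labels (`B-BRAND`, `I-BRAND`, `O`) for each title token and that normalization is a lookup from surface variants to a canonical brand name. The abstract does not state the label scheme or the form of the normalization scheme, so the labels, the `BRAND_NORMALIZATION` table, and all function names here are hypothetical. The labels are supplied by hand in place of a trained CRF or Structured Perceptron.

```python
from typing import Dict, List, Optional

# Hypothetical normalization table: lowercase surface variant -> canonical brand.
# A curated scheme would be far larger; these entries are for illustration only.
BRAND_NORMALIZATION: Dict[str, str] = {
    "hewlett packard": "HP",
    "hp": "HP",
    "h.p.": "HP",
}


def decode_brand(tokens: List[str], labels: List[str]) -> Optional[str]:
    """Return the first B-BRAND (I-BRAND)* span as a string, or None if absent."""
    span: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "B-BRAND":
            if span:          # a second brand span starts; keep only the first
                break
            span = [token]
        elif label == "I-BRAND" and span:
            span.append(token)
        elif span:            # any other label closes the open span
            break
    return " ".join(span) if span else None


def normalize_brand(raw: Optional[str]) -> Optional[str]:
    """Map an extracted brand string to its canonical form when one is known."""
    if raw is None:
        return None
    return BRAND_NORMALIZATION.get(raw.lower().strip(), raw)


def extract_brand(tokens: List[str], labels: List[str]) -> Optional[str]:
    """Decode the labeled brand span, then normalize it."""
    return normalize_brand(decode_brand(tokens, labels))


if __name__ == "__main__":
    # Labels are hand-written here; in practice a sequence model would predict them.
    title_tokens = ["Hewlett", "Packard", "15.6", "inch", "Laptop"]
    predicted = ["B-BRAND", "I-BRAND", "O", "O", "O"]
    print(extract_brand(title_tokens, predicted))
```

Keeping decoding and normalization as separate steps means the lookup table can be curated independently of whichever sequence model produces the labels.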


