Multi-Modal Attribute Extraction for E-Commerce

03/07/2022
by Aloïs De la Comble, et al.

To improve users' experience as they navigate the myriad options offered by online marketplaces, it is essential to have well-organized product catalogs. A key ingredient of such catalogs is the availability of product attributes such as color or material. However, on some marketplaces, such as Rakuten-Ichiba, which we focus on, attribute information is often incomplete or missing altogether. One promising solution is to rely on deep models pre-trained on large corpora to predict attributes from unstructured data, such as product descriptions and images (referred to as modalities in this paper). However, we find that satisfactory performance with this approach does not come out of the box; it is the result of several refinements, which we discuss in this paper. We provide a detailed description of our approach to attribute extraction, from investigating strong single-modality methods to building a solid multimodal model that combines textual and visual information. One key component of our multimodal architecture is a novel approach to seamlessly combining modalities, inspired by our single-modality investigations. In practice, we notice that this new modality-merging method may suffer from modality collapse, i.e., it neglects one modality. Hence, we further propose a mitigation based on a principled regularization scheme. Experiments on Rakuten-Ichiba data provide empirical evidence for the benefits of our approach, which has also been successfully deployed to Rakuten-Ichiba. We also report results on publicly available datasets showing that our model is competitive with several recent multimodal and unimodal baselines.


