OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data

09/11/2023
by Giuseppe Cartella, et al.

The inexorable growth of online shopping and e-commerce demands scalable and robust machine learning-based solutions to accommodate customer requirements. In the context of automatic tagging classification and multimodal retrieval, prior works either adopted supervised learning approaches that generalize poorly or relied on more reusable CLIP-based techniques that were, however, trained on closed-source data. In this work, we propose OpenFashionCLIP, a vision-and-language contrastive learning method that adopts only open-source fashion data stemming from diverse domains and characterized by varying degrees of specificity. Our approach is extensively validated across several tasks and benchmarks, and experimental results highlight a significant out-of-domain generalization capability and consistent improvements over state-of-the-art methods in terms of both accuracy and recall. Source code and trained models are publicly available at: https://github.com/aimagelab/open-fashion-clip.
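As a rough illustration of what vision-and-language contrastive learning means in a CLIP-style setup, the sketch below computes a symmetric contrastive loss over a small batch of paired image and text embeddings. It is a generic formulation written in pure Python for readability, not the OpenFashionCLIP training code: the function names, the toy embeddings, and the temperature value of 0.07 are all assumptions made for this example, and the abstract does not specify the exact objective used.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss for a batch of matched image/text pairs.

    image_embs[i] and text_embs[i] are assumed to describe the same item.
    The temperature is an assumed value chosen for illustration.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # logits[i][j]: scaled cosine similarity between image i and text j
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    image_to_text = cross_entropy_diagonal(logits)
    text_to_image = cross_entropy_diagonal(logits_t)
    return 0.5 * (image_to_text + text_to_image)


# Toy batch: two items whose image and text embeddings point the same way.
aligned = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                [[1.0, 0.0], [0.0, 1.0]])
# Same batch with the captions swapped, so every pair is mismatched.
swapped = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is close to zero when each image is paired with its own caption and large when the captions are swapped, which is the signal that pulls matching image and text embeddings together during training.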

research
09/29/2022

Domain-Unified Prompt Representations for Source-Free Domain Generalization

Domain generalization (DG), aiming to make models work on unseen domains...
research
02/26/2022

A Systematic Evaluation of Large Language Models of Code

Large language models (LMs) of code have recently shown tremendous promi...
research
08/11/2023

Deep Learning-Based Open Source Toolkit for Eosinophil Detection in Pediatric Eosinophilic Esophagitis

Eosinophilic Esophagitis (EoE) is a chronic, immune/antigen-mediated eso...
research
09/19/2017

A Fast and Accurate Vietnamese Word Segmenter

We propose a novel approach to Vietnamese word segmentation. Our approac...
research
12/14/2017

Rasa: Open Source Language Understanding and Dialogue Management

We introduce a pair of tools, Rasa NLU and Rasa Core, which are open sou...
research
04/04/2023

Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing

Fashion illustration is used by designers to communicate their vision an...
research
10/08/2021

Temperature as Uncertainty in Contrastive Learning

Contrastive learning has demonstrated great capability to learn represen...
