Neural-based Cross-modal Search and Retrieval of Artwork

07/26/2023
by Yan Gong, et al.

Creating an intelligent search and retrieval system for artwork images, particularly paintings, is crucial for documenting cultural heritage, fostering wider public engagement, and advancing artistic analysis and interpretation. Visual-Semantic Embedding (VSE) networks are deep learning models for information retrieval that learn joint representations of textual and visual data. These representations enable 1) cross-modal search and retrieval tasks, such as image-to-text and text-to-image retrieval; and 2) relation-focused retrieval, which captures entity relationships and provides more contextually relevant search results. Although VSE networks have played a significant role in cross-modal information retrieval, their application to painting datasets, such as ArtUK, remains unexplored. This paper introduces BoonArt, a VSE-based cross-modal search engine that allows users to search for images using textual queries, and to obtain textual descriptions along with the corresponding images when using image queries. The performance of BoonArt was evaluated using the ArtUK dataset. Experimental evaluations revealed that BoonArt achieved 97 for image-to-text retrieval and 97.4 for text-to-image retrieval. By bridging the gap between textual and visual modalities, BoonArt provides much-improved search performance compared to traditional search engines, such as the one provided by the ArtUK website. BoonArt can also be used with other artwork datasets.
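
The retrieval step that a joint embedding space makes possible can be illustrated with a minimal sketch. This is not BoonArt's implementation: the three-dimensional vectors below are hand-written, hypothetical stand-ins for the outputs of trained image and text encoders, and the names `cosine` and `retrieve` are chosen here for illustration only. The sketch assumes that matching images and descriptions lie close together in the shared space, so that ranking by cosine similarity works in both directions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query_vec, candidates, k=1):
    """Return the ids of the k candidates most similar to the query vector."""
    ranked = sorted(
        candidates,
        key=lambda cid: cosine(query_vec, candidates[cid]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical embeddings (assumption): in a real system these would be
# produced by the image and text encoders of a trained VSE network.
image_embeddings = {
    "img_portrait": [0.9, 0.1, 0.0],
    "img_seascape": [0.1, 0.9, 0.2],
    "img_still_life": [0.0, 0.2, 0.9],
}
text_embeddings = {
    "a portrait of a woman": [0.8, 0.2, 0.1],
    "ships on a stormy sea": [0.2, 0.8, 0.1],
    "a bowl of fruit on a table": [0.1, 0.1, 0.9],
}

# Text-to-image retrieval: a textual query is ranked against image vectors.
top_image = retrieve(text_embeddings["ships on a stormy sea"], image_embeddings)

# Image-to-text retrieval: an image query is ranked against description vectors.
top_text = retrieve(image_embeddings["img_still_life"], text_embeddings)

print(top_image, top_text)
```

Because both modalities share one vector space, the same ranking function serves image-to-text and text-to-image queries; only the roles of query and candidate set are swapped.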


