How to Read Paintings: Semantic Art Understanding with Multi-Modal Retrieval

10/23/2018
by Noa Garcia, et al.

Automatic art analysis has mostly focused on classifying artworks into different artistic styles. However, understanding an artistic representation involves more complex processes, such as identifying the elements in the scene or recognizing author influences. We present SemArt, a multi-modal dataset for semantic art understanding. SemArt is a collection of fine-art painting images in which each image is associated with a number of attributes and a textual artistic comment, such as those that appear in art catalogues or museum collections. To evaluate semantic art understanding, we envisage the Text2Art challenge, a multi-modal retrieval task where relevant paintings are retrieved according to an artistic text, and vice versa. We also propose several models for encoding visual and textual artistic representations into a common semantic space. Our best approach is able to find the correct image within the top 10 ranked images in 45.5% of the test samples. Moreover, our models show remarkable levels of art understanding when compared against human evaluation.
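The retrieval evaluation described in the abstract can be illustrated with a minimal sketch. It assumes that texts and images have already been encoded into a shared embedding space and that the text at index i is paired with the image at index i. The function names, the use of cosine similarity, and the toy vectors are assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def recall_at_k(text_embs, image_embs, k):
    """Fraction of texts whose paired image (same index) ranks in the top k."""
    hits = 0
    for i, text in enumerate(text_embs):
        scores = [cosine(text, img) for img in image_embs]
        # Rank image indices from most to least similar to this text.
        ranking = sorted(range(len(image_embs)),
                         key=lambda j: scores[j], reverse=True)
        if i in ranking[:k]:
            hits += 1
    return hits / len(text_embs)


# Toy embeddings, invented for illustration only (not SemArt data).
toy_text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_images = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.7]]

print(recall_at_k(toy_text, toy_images, k=1))
print(recall_at_k(toy_text, toy_images, k=2))
```

Under these assumptions, the figure reported in the abstract corresponds to `recall_at_k(..., k=10)` computed over the test set. Swapping the two arguments gives the image-to-text direction of the task.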


