AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation

04/04/2023
by Jheng-Hong Yang, et al.

This paper presents the AToMiC (Authoring Tools for Multimedia Content) dataset, designed to advance research in image/text cross-modal retrieval. While vision-language pretrained transformers have led to significant improvements in retrieval effectiveness, existing research has relied on image-caption datasets that feature only simplistic image-text relationships and underspecified user models of retrieval tasks. To address the gap between these oversimplified settings and real-world applications for multimedia content creation, we introduce a new approach for building retrieval test collections. We leverage hierarchical structures and diverse domains of texts, styles, and types of images, as well as large-scale image-document associations embedded in Wikipedia. We formulate two tasks based on a realistic user model and validate our dataset through retrieval experiments using baseline models. AToMiC offers a testbed for scalable, diverse, and reproducible multimedia retrieval research. Finally, the dataset provides the basis for a dedicated track at the 2023 Text Retrieval Conference (TREC), and is publicly available at https://github.com/TREC-AToMiC/AToMiC.

