Evaluating Multimodal Representations on Sentence Similarity: vSTS, Visual Semantic Textual Similarity Dataset

09/11/2018
by Oier Lopez de Lacalle, et al.

In this paper we introduce vSTS, a new dataset for measuring the textual similarity of sentences using multimodal information. The dataset comprises images along with their respective textual captions. We describe the dataset both quantitatively and qualitatively, and claim that it is a valid gold standard for evaluating automatic multimodal textual similarity systems. We also describe initial experiments combining the multimodal information.
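To make the evaluation setting concrete, here is a minimal sketch of how a multimodal similarity system could be scored against gold similarity judgments such as those in vSTS. The weighted mix of textual and visual cosine similarities and the Pearson-correlation evaluation are illustrative assumptions (common practice in STS benchmarks), not necessarily the paper's exact method; the embedding vectors are taken as given.

```python
import numpy as np
from scipy.stats import pearsonr

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def multimodal_score(text_a, text_b, img_a, img_b, alpha=0.5):
    """Mix textual and visual similarity for a caption pair.

    text_a, text_b: sentence embeddings of the two captions.
    img_a, img_b:   image embeddings of the paired images.
    alpha:          weight of the textual signal (alpha=1.0 is text-only).
    """
    return alpha * cosine(text_a, text_b) + (1.0 - alpha) * cosine(img_a, img_b)

def evaluate(system_scores, gold_scores):
    """STS systems are conventionally ranked by Pearson correlation
    between predicted similarities and human gold scores."""
    r, _ = pearsonr(system_scores, gold_scores)
    return r
```

In this sketch, alpha could be tuned on held-out pairs; the text-only baseline corresponds to alpha=1.0, which makes the contribution of the visual signal easy to isolate.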

research
04/04/2020

Evaluating Multimodal Representations on Visual Semantic Textual Similarity

The combination of visual and textual representations has produced excel...
research
05/22/2022

The Case for Perspective in Multimodal Datasets

This paper argues in favor of the adoption of annotation practices for m...
research
04/18/2018

Quantifying the visual concreteness of words and topics in multimodal datasets

Multimodal machine learning algorithms aim to learn visual-textual corre...
research
06/30/2018

The Historical Significance of Textual Distances

Measuring similarity is a basic task in information retrieval, and now o...
research
04/16/2019

Unsupervised Discovery of Multimodal Links in Multi-Image, Multi-Sentence Documents

Images and text co-occur everywhere on the web, but explicit links betwe...
research
09/23/2020

Cosine Similarity of Multimodal Content Vectors for TV Programmes

Multimodal information originates from a variety of sources: audiovisual...
research
08/19/2021

Czech News Dataset for Semantic Textual Similarity

This paper describes a novel dataset consisting of sentences with semant...
