OPI at SemEval 2023 Task 1: Image-Text Embeddings and Multimodal Information Retrieval for Visual Word Sense Disambiguation

04/14/2023
by   Sławomir Dadas, et al.
0

The goal of visual word sense disambiguation is to find the image that best matches the provided description of the word's meaning. It is a challenging problem, requiring approaches that combine language and image understanding. In this paper, we present our submission to SemEval 2023 visual word sense disambiguation shared task. The proposed system integrates multimodal embeddings, learning to rank methods, and knowledge-based approaches. We build a classifier based on the CLIP model, whose results are enriched with additional information retrieved from Wikipedia and lexical databases. Our solution was ranked third in the multilingual task and won in the Persian track, one of the three language subtasks.

READ FULL TEXT
research
03/30/2016

Unsupervised Visual Sense Disambiguation for Verbs using Multimodal Embeddings

We introduce a new task, visual sense disambiguation for verbs: given an...
research
06/24/2019

LIAAD at SemDeep-5 Challenge: Word-in-Context (WiC)

This paper describes the LIAAD system that was ranked second place in th...
research
06/24/2023

UAlberta at SemEval-2023 Task 1: Context Augmentation and Translation for Multilingual Visual Word Sense Disambiguation

We describe the systems of the University of Alberta team for the SemEva...
research
10/19/2017

Findings of the Second Shared Task on Multimodal Machine Translation and Multilingual Image Description

We present the results from the second shared task on multimodal machine...
research
01/14/2016

Linear Algebraic Structure of Word Senses, with Applications to Polysemy

Word embeddings are ubiquitous in NLP and information retrieval, but it'...
research
01/25/2021

PolyLM: Learning about Polysemy through Language Modeling

To avoid the "meaning conflation deficiency" of word embeddings, a numbe...
research
03/25/2017

Learning to Predict: A Fast Re-constructive Method to Generate Multimodal Embeddings

Integrating visual and linguistic information into a single multimodal r...

Please sign up or login with your details

Forgot password? Click here to reset