Multimodal Representation Model based on Graph-Based Rank Fusion

12/21/2019
by   Icaro Cavalcante Dourado, et al.
0

This paper proposes an unsupervised representation model, based on rank-fusion graphs, for general applicability in multimodal tasks, either unsupervised or supervised prediction. Rank-fusion graphs encode information from multiple descriptors and retrieval models, thus being able to capture underlying relationships between modalities, samples, and the collection itself. By doing so, our method is able to promote a fusion model better than either early-fusion and late-fusion alternatives. The solution is based on the encoding of multiple ranks of a query, defined according to different criteria, into a graph. Later, we embed the generated graph into a feature space, creating fusion vectors. Those embeddings are employed to build an estimator that infers whether an input (even multimodal) object refers to a class (or event) or not. Performed experiments in the context of multiple multimodal and visual datasets, evaluated on several descriptors and retrieval models, demonstrate that our representation model is highly effective for different detection scenarios involving visual, textual, and multimodal features, yielding better detection results than state-of-the-art methods.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
01/17/2019

Unsupervised Graph-based Rank Aggregation for Improved Retrieval

This paper presents a robust and comprehensive graph-based rank aggregat...
research
06/14/2019

Fusion vectors: Embedding Graph Fusions for Efficient Unsupervised Rank Aggregation

The vast increase in amount and complexity of digital content led to a w...
research
04/30/2019

Multimodal Classification of Urban Micro-Events

In this paper we seek methods to effectively detect urban micro-events. ...
research
04/17/2018

Deep Multimodal Subspace Clustering Networks

We present convolutional neural network (CNN) based approaches for unsup...
research
08/29/2023

A Multimodal Visual Encoding Model Aided by Introducing Verbal Semantic Information

Biological research has revealed that the verbal semantic information in...
research
05/13/2020

Towards Better Graph Representation: Two-Branch Collaborative Graph Neural Networks for Multimodal Marketing Intention Detection

Inspired by the fact that spreading and collecting information through t...
research
10/31/2018

Query Adaptive Late Fusion for Image Retrieval

Feature fusion is a commonly used strategy in image retrieval tasks, whi...

Please sign up or login with your details

Forgot password? Click here to reset