Diachronic Cross-modal Embeddings

09/30/2019
by David Semedo, et al.

Understanding the semantic shifts of multimodal information is only possible with models that capture cross-modal interactions over time. Under this paradigm, a new embedding is needed that structures visual-textual interactions along the temporal dimension, thus preserving the data's original temporal organisation. This paper introduces a novel diachronic cross-modal embedding (DCM), in which cross-modal correlations are represented in the embedding space throughout the temporal dimension, preserving semantic similarity at each instant t. To achieve this, we train a neural cross-modal architecture under a novel ranking loss strategy that, for each multimodal instance, enforces the temporal alignment of neighbouring instances through subspace structuring constraints based on a temporal alignment window. Experimental results show that the DCM embedding successfully organises instances over time. Quantitative experiments confirm that DCM preserves semantic cross-modal correlations at each instant t while also providing better alignment capabilities. Qualitative experiments unveil new ways to browse multimodal content and hint that multimodal understanding tasks can benefit from this new embedding.
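The abstract does not give the exact formulation of the loss, but as a rough illustration, the sketch below shows how a hinge-based cross-modal ranking loss could incorporate a temporal alignment window. The function name dcm_ranking_loss, the margin and window parameters, and the down-weighting of temporally close negatives are illustrative assumptions, not the paper's actual method.

```python
import torch
import torch.nn.functional as F


def dcm_ranking_loss(img_emb, txt_emb, timestamps, margin=0.2, window=1.0):
    # img_emb, txt_emb: (N, D) embeddings of matched image-text pairs.
    # timestamps: (N,) creation times of each multimodal instance.
    # All names and the weighting scheme below are illustrative assumptions.

    # Pairwise cosine similarity between every image and every text embedding.
    sim = F.cosine_similarity(img_emb.unsqueeze(1), txt_emb.unsqueeze(0), dim=-1)  # (N, N)
    pos = sim.diag().unsqueeze(1)  # similarity of each true (image, text) pair

    # Temporal alignment window: flag pairs whose timestamps are far apart.
    dt = (timestamps.unsqueeze(1) - timestamps.unsqueeze(0)).abs()
    outside = (dt > window).float()

    # Hinge ranking term: negatives should score below the positive by a margin.
    # Temporally distant negatives get the full penalty; temporally close ones
    # are down-weighted, since they may legitimately share semantics at instant t.
    weight = 0.5 + 0.5 * outside
    cost = weight * torch.clamp(margin + sim - pos, min=0.0)
    cost = cost - torch.diag(cost.diag())  # drop the positive pair's own term
    return cost.mean()


# Example usage with random data (8 pairs, 128-dim embeddings):
# loss = dcm_ranking_loss(torch.randn(8, 128), torch.randn(8, 128), torch.rand(8))
```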


Related research:

- Revisiting Cross Modal Retrieval (07/19/2018): This paper proposes a cross-modal retrieval system that leverages on ima...
- Does my multimodal model learn cross-modal interactions? It's harder to tell than you might think! (10/13/2020): Modeling expressive cross-modal interactions seems crucial in multimodal...
- Learning Social Image Embedding with Deep Multimodal Attention Networks (10/18/2017): Learning social media data embedding by deep models has attracted extens...
- Cross-Modal Subspace Learning with Scheduled Adaptive Margin Constraints (09/30/2019): Cross-modal embeddings, between textual and visual modalities, aim to or...
- Target-Oriented Deformation of Visual-Semantic Embedding Space (10/15/2019): Multimodal embedding is a crucial research topic for cross-modal underst...
- DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment (08/22/2023): Cross-modal garment synthesis and manipulation will significantly benefi...
- Cross Modal Compression: Towards Human-comprehensible Semantic Compression (09/06/2022): Traditional image/video compression aims to reduce the transmission/stor...
