Constructing Hierarchical Image-tags Bimodal Representations for Word Tags Alternative Choice

07/04/2013
by   Fangxiang Feng, et al.

This paper describes our solution to the multi-modal learning challenge of ICML. The solution consists of constructing three levels of representations in three consecutive stages and then choosing the correct tag words with a data-specific strategy. First, we use standard methods to obtain the level-1 representations: each image is represented using MPEG-7 and gist descriptors together with additional features released by the contest organizers, and the corresponding word tags are represented by a bag-of-words model with a dictionary of 4000 words. Second, we learn the level-2 representations using two stacked RBMs for each modality. Third, we propose a bimodal auto-encoder that learns the similarities and dissimilarities between image-tags pairs as the level-3 representations. Finally, during the test phase, based on one observation about the dataset, we devise a data-specific strategy for choosing the correct tag words, which leads to a large improvement in overall performance. Our final average accuracy on the private test set of the challenge is 100%.
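As a rough illustration of the level-1 tag representation mentioned above, the following Python sketch encodes a list of tag words as a bag-of-words count vector over a fixed dictionary. The abstract does not say how the 4000-word dictionary was built or whether counts or binary indicators were used, so the frequency-based selection, the count encoding, and the names `build_dictionary` and `bag_of_words` are assumptions made only for this example.

```python
from collections import Counter


def build_dictionary(tag_lists, max_words=4000):
    """Map the most frequent tag words to vector indices.

    Assumption: the dictionary keeps the `max_words` most frequent words.
    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(word for tags in tag_lists for word in tags)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word: i for i, (word, _) in enumerate(ranked[:max_words])}


def bag_of_words(tags, dictionary):
    """Return a count vector; words outside the dictionary are ignored."""
    vec = [0] * len(dictionary)
    for word in tags:
        idx = dictionary.get(word)
        if idx is not None:
            vec[idx] += 1
    return vec


# Toy data standing in for the tag sets of three training images.
tag_lists = [["sky", "cloud", "blue"], ["sky", "sea"], ["dog", "grass", "sky"]]
dictionary = build_dictionary(tag_lists, max_words=4)
vector = bag_of_words(["sky", "sea", "unknown"], dictionary)
```

With `max_words=4000` the same functions would produce vectors of the dimensionality stated in the abstract; such vectors would then serve as the visible-layer input for the tag-side RBMs.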


research
09/19/2018

Audio Based Disambiguation Of Music Genre Tags

In this paper, we propose to infer music genre embeddings from audio dat...
research
02/15/2022

Unsupervised word-level prosody tagging for controllable speech synthesis

Although word-level prosody modeling in neural text-to-speech (TTS) has ...
research
09/17/2014

Adaptive Tag Selection for Image Annotation

Not all tags are relevant to an image, and the number of relevant tags i...
research
06/25/2016

Finding the Topic of a Set of Images

In this paper we introduce the problem of determining the topic that a s...
research
01/12/2016

Learning Subclass Representations for Visually-varied Image Classification

In this paper, we present a subclass-representation approach that predic...
research
10/14/2021

Tagged Documents Co-Clustering

Tags are short sequences of words allowing to describe textual and non-t...
research
06/24/2022

Deep embedded clustering algorithm for clustering PACS repositories

Creating large datasets of medical radiology images from several sources...
