Conditional Image-Text Embedding Networks

11/22/2017
by Bryan A. Plummer, et al.

This paper presents an approach for grounding phrases in images that jointly learns multiple text-conditioned embeddings in a single end-to-end model. To separate text phrases into semantically distinct subspaces, we propose a concept weight branch that automatically assigns phrases to embeddings, whereas prior works predefine such assignments. Our proposed solution simplifies the representation requirements for individual embeddings and allows underrepresented concepts to take advantage of shared representations before they are fed into concept-specific layers. Comprehensive experiments verify the effectiveness of our approach on three phrase grounding datasets (Flickr30K Entities, ReferIt Game, and Visual Genome), where we obtain improvements in grounding performance over a strong region-phrase embedding baseline.
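
The idea of a concept weight branch can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ConditionalEmbeddingSketch`, the element-wise product used to combine region and phrase features, the single linear layer per concept, and the random untrained weights are all assumptions made for illustration. The sketch shows only the routing step, where a softmax over phrase features produces soft assignments that mix the scores of several concept-specific layers.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, vec):
    # Multiply a row-major matrix (list of rows) by a vector.
    return [dot(row, vec) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class ConditionalEmbeddingSketch:
    """Hypothetical illustration: K concept-specific scoring layers whose
    outputs are mixed by phrase-dependent concept weights."""

    def __init__(self, dim, num_concepts, seed=0):
        rng = random.Random(seed)
        # Concept weight branch (assumed form): one linear layer on the phrase.
        self.concept_branch = random_matrix(num_concepts, dim, rng)
        # One linear scoring layer per concept (assumed form).
        self.concept_layers = random_matrix(num_concepts, dim, rng)

    def concept_weights(self, phrase):
        # Soft assignment of the phrase to each concept embedding.
        return softmax(matvec(self.concept_branch, phrase))

    def score(self, region, phrase):
        # Shared representation: element-wise product of the two features
        # (an assumption; both vectors must have the same length).
        joint = [r * p for r, p in zip(region, phrase)]
        per_concept = matvec(self.concept_layers, joint)
        weights = self.concept_weights(phrase)
        # Final score is the concept-weighted sum of per-concept scores.
        return dot(weights, per_concept)


def ground(model, regions, phrase):
    # Return the index of the highest-scoring candidate region.
    scores = [model.score(region, phrase) for region in regions]
    return max(range(len(regions)), key=lambda i: scores[i])
```

In a trained model the weights would be learned end to end, so the assignment of phrases to embeddings would emerge from the data rather than being fixed in advance.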


