Cycled Compositional Learning between Images and Text

07/24/2021
by Jongseok Kim, et al.

We present the Cycled Composition Network, an approach that measures the semantic distance of a composed image-text embedding. First, the Composition Network transits a reference image to a target image in an embedding space using a relative caption. Second, the Correction Network calculates the difference between the reference and retrieved target images in the embedding space and matches it with the relative caption. Our goal is to learn a composition mapping with the Composition Network. Since this one-way mapping is highly under-constrained, we couple it with inverse relation learning by the Correction Network and introduce a cycled relation for a given image. We participated in the Fashion IQ 2020 challenge and won first place with an ensemble of our model.
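The cycled relation described above can be illustrated with a toy sketch. This is not the authors' implementation: the real Composition and Correction Networks are learned models, whereas here they are replaced by a plain vector addition and subtraction, and every function name (`compose`, `correct`, `cycled_loss`, `retrieve`) as well as the use of cosine distance is an assumption made for illustration only.

```python
import math

def cosine_distance(a, b):
    """Cosine distance between two embedding vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar
    return 1.0 - dot / (na * nb)

def compose(ref_img, caption):
    # Hypothetical stand-in for the Composition Network:
    # shift the reference embedding by the caption embedding.
    return [r + c for r, c in zip(ref_img, caption)]

def correct(ref_img, tgt_img):
    # Hypothetical stand-in for the Correction Network:
    # the difference between target and reference embeddings.
    return [t - r for r, t in zip(ref_img, tgt_img)]

def cycled_loss(ref_img, caption, tgt_img):
    # Forward term: the composed embedding should land near the target.
    forward = cosine_distance(compose(ref_img, caption), tgt_img)
    # Inverse term: the image difference should match the caption.
    backward = cosine_distance(correct(ref_img, tgt_img), caption)
    return forward + backward

def retrieve(ref_img, caption, gallery):
    # Return the index of the gallery embedding closest to the composed query.
    query = compose(ref_img, caption)
    return min(range(len(gallery)),
               key=lambda i: cosine_distance(query, gallery[i]))

# Toy embeddings, made up for this example.
ref = [1.0, 0.0, 0.0]
caption = [0.0, 1.0, 0.0]
gallery = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

best = retrieve(ref, caption, gallery)
loss = cycled_loss(ref, caption, gallery[best])
```

In this toy setting the inverse term constrains the same triplet from the opposite direction, which is the intuition behind coupling the two mappings: a target that satisfies the forward composition but whose difference from the reference does not match the caption is still penalized.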


research
03/27/2020

CurlingNet: Compositional Learning between Images and Text for Fashion IQ Data

We present an approach named CurlingNet that can measure the semantic di...
research
03/15/2023

Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion

Diffusion models have shown superior performance in image generation and...
research
09/14/2020

VC-Net: Deep Volume-Composition Networks for Segmentation and Visualization of Highly Sparse and Noisy Image Data

The motivation of our work is to present a new visualization-guided comp...
research
07/13/2020

Fashion-IQ 2020 Challenge 2nd Place Team's Solution

This paper is dedicated to team VAA's approach submitted to the Fashion-...
research
11/23/2022

Space-efficient RLZ-to-LZ77 conversion

Consider a text T [1..n] prefixed by a reference sequence R = T [1..ℓ]. ...
research
04/07/2021

RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network

In this paper, we study the compositional learning of images and texts f...
research
10/28/2021

BERTian Poetics: Constrained Composition with Masked LMs

Masked language models have recently been interpreted as energy-based se...
