Recurrent Multimodal Interaction for Referring Image Segmentation

03/23/2017
βˆ™
by   Chenxi Liu, et al.
βˆ™
0
βˆ™

In this paper we are interested in the problem of image segmentation given natural language descriptions, i.e. referring expressions. Existing works tackle this problem by first modeling images and sentences independently and then segment images by combining these two types of representations. We argue that learning word-to-image interaction is more native in the sense of jointly modeling two modalities for the image segmentation task, and we propose convolutional multimodal LSTM to encode the sequential interactions between individual words, visual information, and spatial information. We show that our proposed model outperforms the baseline model on benchmark datasets. In addition, we analyze the intermediate output of the proposed multimodal LSTM approach and empirically explain how this approach enforces a more effective word-to-image interaction.

READ FULL TEXT

page 1

page 4

page 7

page 8

research
βˆ™ 01/30/2020

Dual Convolutional LSTM Network for Referring Image Segmentation

We consider referring image segmentation. It is a problem at the interse...
research
βˆ™ 10/01/2020

Linguistic Structure Guided Context Modeling for Referring Image Segmentation

Referring image segmentation aims to predict the foreground mask of the ...
research
βˆ™ 10/04/2014

Explain Images with Multimodal Recurrent Neural Networks

In this paper, we present a multimodal Recurrent Neural Network (m-RNN) ...
research
βˆ™ 04/21/2021

Comprehensive Multi-Modal Interactions for Referring Image Segmentation

We investigate Referring Image Segmentation (RIS), which outputs a segme...
research
βˆ™ 03/28/2020

BiLingUNet: Image Segmentation by Modulating Top-Down and Bottom-Up Visual Processing with Referring Expressions

We present BiLingUNet, a state-of-the-art model for image segmentation u...
research
βˆ™ 01/05/2021

Deep Class-Specific Affinity-Guided Convolutional Network for Multimodal Unpaired Image Segmentation

Multi-modal medical image segmentation plays an essential role in clinic...
research
βˆ™ 03/20/2016

Segmentation from Natural Language Expressions

In this paper we approach the novel problem of segmenting an image based...

Please sign up or login with your details

Forgot password? Click here to reset