Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner

05/02/2017
by   Tseng-Hung Chen, et al.

Impressive image captioning results are achieved in domains with plenty of training image and sentence pairs (e.g., MSCOCO). However, transferring to a target domain with significant domain shifts but no paired training data (referred to as cross-domain image captioning) remains largely unexplored. We propose a novel adversarial training procedure to leverage unpaired data in the target domain. Two critic networks are introduced to guide the captioner: a domain critic and a multi-modal critic. The domain critic assesses whether the generated sentences are indistinguishable from sentences in the target domain. The multi-modal critic assesses whether an image and its generated sentence are a valid pair. During training, the critics and the captioner act as adversaries: the captioner aims to generate indistinguishable sentences, whereas the critics aim to distinguish them. The critics' assessment improves the captioner through policy gradient updates. During inference, we further propose a novel critic-based planning method to select high-quality sentences without additional supervision (e.g., tags). To evaluate, we use MSCOCO as the source domain and four other datasets (CUB-200-2011, Oxford-102, TGIF, and Flickr30k) as the target domains. Our method consistently performs well on all datasets. In particular, on CUB-200-2011, we achieve a 21.8% CIDEr-D improvement after adaptation. Utilizing critics during inference gives a further 4.5% boost.
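The interplay between the two critics and the captioner can be illustrated with a small sketch. It is not the authors' implementation: the function names, the choice to multiply the two critic scores into a single reward, and the sentence-level reranking used here in place of the paper's planning procedure are all assumptions made for illustration. The critics are stand-in callables that return a score in [0, 1].

```python
import math
from typing import Callable, Sequence

# Hypothetical critic signatures (assumptions for this sketch):
#   domain critic:      sentence -> score in [0, 1]
#   multi-modal critic: (image, sentence) -> score in [0, 1]
DomainCritic = Callable[[str], float]
MultiModalCritic = Callable[[str, str], float]


def combined_reward(domain_score: float, multimodal_score: float) -> float:
    """Merge both critic scores into one reward.

    Using the product is an assumption: it is high only when the
    sentence looks like the target domain AND matches the image.
    """
    return domain_score * multimodal_score


def reinforce_loss(token_log_probs: Sequence[float],
                   reward: float,
                   baseline: float = 0.0) -> float:
    """REINFORCE-style surrogate loss for one sampled sentence.

    Minimizing it raises the log-probability of sentences whose
    reward exceeds the baseline and lowers it otherwise.
    """
    return -(reward - baseline) * sum(token_log_probs)


def select_caption(image: str,
                   candidates: Sequence[str],
                   domain_critic: DomainCritic,
                   multimodal_critic: MultiModalCritic) -> str:
    """Pick the candidate the critics rate highest (simplified reranking)."""
    return max(
        candidates,
        key=lambda s: combined_reward(domain_critic(s),
                                      multimodal_critic(image, s)),
    )


# Toy critics, purely illustrative: they only check for a keyword.
def toy_domain_critic(sentence: str) -> float:
    return 0.9 if "bird" in sentence else 0.1


def toy_multimodal_critic(image: str, sentence: str) -> float:
    return 0.8 if image in sentence else 0.2


if __name__ == "__main__":
    captions = ["a man sitting on a bench",
                "a small bird with a red crown"]
    best = select_caption("bird", captions,
                          toy_domain_critic, toy_multimodal_critic)
    print(best)
    # Loss for a two-token sample with reward 1.0 and no baseline.
    print(reinforce_loss([math.log(0.5), math.log(0.5)], reward=1.0))
```

In a real system the critics would be trained networks and the loss would be backpropagated through the captioner; the sketch only shows how a scalar reward from the critics could drive both the training signal and the choice among candidate sentences at inference time.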


research
02/11/2022

Cross Domain Few-Shot Learning via Meta Adversarial Training

Few-shot relation classification (RC) is one of the critical problems in...
research
07/16/2020

Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation

Fully supervised neural approaches have achieved significant progress in...
research
02/28/2023

Self-training through Classifier Disagreement for Cross-Domain Opinion Target Extraction

Opinion target extraction (OTE) or aspect extraction (AE) is a fundament...
research
04/09/2020

Recommendation Chart of Domains for Cross-Domain Sentiment Analysis: Findings of A 20 Domain Study

Cross-domain sentiment analysis (CDSA) helps to address the problem of d...
research
07/23/2019

GA-DAN: Geometry-Aware Domain Adaptation Network for Scene Text Detection and Recognition

Recent adversarial learning research has achieved very impressive progre...
research
01/13/2020

Memorizing Comprehensively to Learn Adaptively: Unsupervised Cross-Domain Person Re-ID with Multi-level Memory

Unsupervised cross-domain person re-identification (Re-ID) aims to adapt...
research
11/05/2017

Adversarial Dropout Regularization

We present a method for transferring neural representations from label-r...
