Show-and-Fool: Crafting Adversarial Examples for Neural Image Captioning

12/06/2017
by Hongge Chen, et al.

Modern neural image captioning systems typically adopt the encoder-decoder framework, consisting of two principal components: a convolutional neural network (CNN) for image feature extraction and a recurrent neural network (RNN) for caption generation. Inspired by the robustness analysis of CNN-based image classifiers under adversarial perturbations, we propose Show-and-Fool, a novel algorithm for crafting adversarial examples in neural image captioning. Unlike image classification, which has a finite set of class labels, finding visually similar adversarial examples for an image captioning system is far more challenging because the space of possible captions is almost infinite. In this paper, we design three approaches for crafting adversarial examples in image captioning: (i) a targeted caption method; (ii) a targeted keyword method; and (iii) an untargeted method. We formulate the search for adversarial perturbations as optimization problems and design novel loss functions for efficient search. Experimental results on the Show-and-Tell model and the MSCOCO data set show that Show-and-Fool can successfully craft visually similar adversarial examples with randomly targeted captions, and that these adversarial examples are highly transferable to the Show-Attend-and-Tell model. Consequently, the existence of such adversarial examples raises new robustness concerns for neural image captioning. To the best of our knowledge, this is the first work on crafting effective adversarial examples for image captioning tasks.
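To make the optimization view concrete, the sketch below illustrates the targeted caption method in the spirit described above: a Carlini-and-Wagner-style formulation that minimizes a weighted sum of a caption loss and the L2 distortion, optimizing in tanh space so the perturbed pixels always remain in a valid range. It is a minimal sketch only: the model interface (`model(adv, prefix)`) is a hypothetical placeholder rather than the paper's actual API, and plain teacher-forced cross-entropy stands in for the paper's novel loss functions.

```python
# Minimal sketch of a targeted-caption attack: find a small perturbation
# such that the captioner emits a chosen target caption. The captioner
# interface here is a hypothetical placeholder, not the paper's code.
import torch


def targeted_caption_attack(model, image, target_ids, c=1.0,
                            steps=1000, lr=1e-2):
    """Craft an adversarial image whose caption matches target_ids.

    model      -- hypothetical encoder-decoder captioner: given an image
                  and a caption prefix, returns per-step vocabulary logits
    image      -- input tensor with pixel values in [0, 1]
    target_ids -- token ids of the desired (targeted) caption
    c          -- trade-off between caption loss and distortion
    """
    # Optimize in tanh space so the adversarial image stays in [0, 1].
    w = torch.atanh(2 * image.clamp(1e-6, 1 - 1e-6) - 1).detach().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        adv = 0.5 * (torch.tanh(w) + 1)        # map back to [0, 1]
        logits = model(adv, target_ids[:-1])   # teacher-forced decoding
        # Cross-entropy pushes the decoder toward the target caption
        # (a simplification of the paper's loss on the caption logits).
        caption_loss = torch.nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), target_ids[1:].reshape(-1))
        distortion = torch.sum((adv - image) ** 2)
        loss = c * caption_loss + distortion
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (0.5 * (torch.tanh(w) + 1)).detach()
```

In practice the trade-off constant c would be tuned (e.g., by binary search, as is common in C&W-style attacks) to find the smallest distortion that still forces the target caption; the targeted keyword and untargeted variants differ only in the caption-loss term.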


Related research

07/07/2021 · Controlled Caption Generation for Images Through Adversarial Attacks
Deep learning is found to be vulnerable to adversarial examples. However...

06/13/2023 · I See Dead People: Gray-Box Adversarial Attack on Image-To-Text Models
Modern image-to-text systems typically adopt the encoder-decoder framework...

05/10/2019 · Exact Adversarial Attack to Image Captioning via Structured Output Learning with Latent Variables
In this work, we study the robustness of a CNN+RNN based image captioning...

03/24/2018 · CNN Based Adversarial Embedding with Minimum Alteration for Image Steganography
Historically, steganographic schemes were designed in a way to preserve...

03/03/2018 · Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples
Crafting adversarial examples has become an important technique to evaluate...

03/30/2016 · Dense Image Representation with Spatial Pyramid VLAD Coding of CNN for Locally Robust Captioning
The workflow of extracting features from images using convolutional neural...

01/04/2022 · Interactive Attention AI to translate low light photos to captions for night scene understanding in women safety
There is amazing progress in Deep Learning based models for Image captioning...
