Adversarial Prompting for Black Box Foundation Models

02/08/2023
by Natalie Maus, et al.

Prompting interfaces allow users to quickly adjust the output of generative models in both vision and language. However, small changes and design choices in the prompt can lead to significant differences in the output. In this work, we develop a black-box framework for generating adversarial prompts for unstructured image and text generation. These prompts, which can be standalone or prepended to benign prompts, induce specific behaviors in the generative process, such as generating images of a particular object or biasing the frequency of specific letters in the generated text.
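
The black-box setting described above can be illustrated with a minimal sketch. It assumes plain random search over a tiny hypothetical vocabulary (`VOCAB`) and a toy stand-in generator (`toy_generator`) that only echoes its input; neither is taken from the paper, and the abstract does not say which optimizer the authors use. The only point is that the search queries the model and scores its output, without gradients or internals.

```python
import random

# Hypothetical token set; a real attack would search a model's vocabulary.
VOCAB = ["apple", "banana", "cherry", "kiwi", "lemon", "mango", "plum"]


def toy_generator(prompt: str) -> str:
    """Stand-in for a black-box text model: echoes the prompt words in reverse order."""
    return " ".join(reversed(prompt.split()))


def letter_score(text: str, letter: str) -> float:
    """Fraction of alphabetic characters in `text` that equal `letter`."""
    letters = [c for c in text.lower() if c.isalpha()]
    return letters.count(letter) / len(letters) if letters else 0.0


def random_search(benign: str, letter: str, n_tokens: int = 3,
                  n_queries: int = 200, seed: int = 0):
    """Search for a prefix that raises the target letter's frequency in the output.

    Uses only input/output queries to the generator (the black-box assumption).
    """
    rng = random.Random(seed)
    # Start from the unmodified benign prompt as the baseline.
    best_prefix, best_score = "", letter_score(toy_generator(benign), letter)
    for _ in range(n_queries):
        prefix = " ".join(rng.choice(VOCAB) for _ in range(n_tokens))
        score = letter_score(toy_generator(f"{prefix} {benign}"), letter)
        if score > best_score:
            best_prefix, best_score = prefix, score
    return best_prefix, best_score


if __name__ == "__main__":
    prefix, score = random_search("a picture of a dog", "a")
    print(repr(prefix), round(score, 3))
```

Swapping `toy_generator` for a call to a real model and `letter_score` for any other measurable property of the output would keep the same loop structure; a more query-efficient optimizer would replace the random sampling step.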

research
10/21/2020

Black-Box Ripper: Copying black-box models using generative evolutionary algorithms

We study the task of replicating the functionality of black-box neural m...
research
06/22/2018

xGEMs: Generating Examplars to Explain Black-Box Models

This work proposes xGEMs or manifold guided exemplars, a framework to un...
research
09/20/2019

Creative GANs for generating poems, lyrics, and metaphors

Generative models for text have substantially contributed to tasks like ...
research
07/27/2023

Evaluating Generative Models for Graph-to-Text Generation

Large language models (LLMs) have been widely employed for graph-to-text...
research
11/07/2022

Proper losses for discrete generative models

We initiate the study of proper losses for evaluating generative models ...
research
05/14/2023

Watermarking Text Generated by Black-Box Language Models

LLMs now exhibit human-like skills in various fields, leading to worries...
research
06/06/2020

A Generic and Model-Agnostic Exemplar Synthetization Framework for Explainable AI

With the growing complexity of deep learning methods adopted in practica...
