Concept Algebra for Text-Controlled Vision Models

02/07/2023
by Zihao Wang, et al.

This paper concerns the control of text-guided generative models, where a user provides a natural language prompt and the model generates samples based on this input. Prompting is intuitive, general, and flexible. However, there are significant limitations: prompting can fail in surprising ways, and it is often unclear how to find a prompt that will elicit some desired target behavior. A core difficulty for developing methods to overcome these issues is that failures are know-it-when-you-see-it – it's hard to fix bugs if you can't state precisely what the model should have done! In this paper, we introduce a formalization of "what the user intended" in terms of latent concepts implicit to the data generating process that the model was trained on. This formalization allows us to identify some fundamental limitations of prompting. We then use the formalism to develop concept algebra to overcome these limitations. Concept algebra is a way of directly manipulating the concepts expressed in the output through algebraic operations on a suitably defined representation of input prompts. We give examples using concept algebra to overcome limitations of prompting, including concept transfer through arithmetic, and concept nullification through projection. Code available at https://github.com/zihao12/concept-algebra.
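The two operations named in the abstract, concept transfer through arithmetic and concept nullification through projection, can be illustrated on plain vectors. The following is only a minimal sketch, not the authors' implementation: it assumes a prompt is already represented as a NumPy vector and that a concept corresponds to a linear subspace spanned by a few known, linearly independent direction vectors. The function names `orthonormal_basis`, `project_out`, and `transfer` are hypothetical, chosen for illustration; how the representation and the concept subspace are actually defined and estimated is the subject of the paper and its repository.

```python
import numpy as np


def orthonormal_basis(directions):
    """Orthonormal basis (as columns) for the span of the given row vectors.

    Assumption: the rows of `directions` are linearly independent.
    """
    q, _ = np.linalg.qr(np.asarray(directions, dtype=float).T)
    return q


def project_out(rep, basis):
    """Nullify a concept: remove the part of `rep` inside the concept subspace."""
    return rep - basis @ (basis.T @ rep)


def transfer(rep, donor, basis):
    """Transfer a concept: swap `rep`'s subspace component for `donor`'s."""
    return project_out(rep, basis) + basis @ (basis.T @ donor)


# Toy example: the concept subspace is the first coordinate axis (an assumption).
basis = orthonormal_basis([[1.0, 0.0, 0.0, 0.0]])
rep = np.array([2.0, 3.0, 0.0, 1.0])
donor = np.array([-5.0, 7.0, 7.0, 7.0])

nullified = project_out(rep, basis)      # first coordinate removed
transferred = transfer(rep, donor, basis)  # first coordinate taken from donor
```

In this toy setting, nullification zeroes the component of `rep` along the concept direction and leaves the rest untouched, while transfer keeps everything outside the subspace from `rep` and takes the concept component from `donor`.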


