ViperGPT: Visual Inference via Python Execution for Reasoning

03/14/2023
by Dídac Surís, et al.

Answering visual queries is a complex task that requires both visual processing and reasoning. End-to-end models, the dominant approach for this task, do not explicitly differentiate between the two, limiting interpretability and generalization. Learning modular programs presents a promising alternative, but has proven challenging due to the difficulty of learning both the programs and modules simultaneously. We introduce ViperGPT, a framework that leverages code-generation models to compose vision-and-language models into subroutines to produce a result for any query. ViperGPT utilizes a provided API to access the available modules, and composes them by generating Python code that is later executed. This simple approach requires no further training, and achieves state-of-the-art results across various complex visual tasks.
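
The generate-then-execute loop described above can be illustrated with a toy sketch. Everything named here is hypothetical and not taken from the paper: the Patch class stands in for real vision modules, API_SPEC stands in for the API text shown to the code model, and fake_code_model returns a canned program where a real system would query a code-generation model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Patch:
    """Toy stand-in for an image region; `objects` fakes detector output."""
    objects: List[str]

    def find(self, name: str) -> List["Patch"]:
        # A real module would run an object detector here.
        return [Patch([o]) for o in self.objects if o == name]

    def exists(self, name: str) -> bool:
        return len(self.find(name)) > 0


# Hypothetical API description that would be placed in the model's prompt.
API_SPEC = '''
class Patch:
    def find(self, name: str) -> List[Patch]: ...
    def exists(self, name: str) -> bool: ...
'''


def fake_code_model(api_spec: str, query: str) -> str:
    """Stand-in for a code-generation model: returns a canned program."""
    return (
        "def execute_command(image):\n"
        "    apples = image.find('apple')\n"
        "    kids = image.find('kid')\n"
        "    return len(apples) // len(kids)\n"
    )


def answer(query: str, image: Patch):
    # 1. Ask the (fake) code model for a program that uses the API.
    source = fake_code_model(API_SPEC, query)
    # 2. Execute the generated source to define execute_command.
    namespace = {"Patch": Patch}
    exec(source, namespace)
    # 3. Run the generated program on the input image.
    return namespace["execute_command"](image)


image = Patch(["apple"] * 4 + ["kid"] * 2)
result = answer("How many apples can each kid have?", image)
print(result)
```

No training happens anywhere in this loop: the modules and the code model are used as-is, and only the generated program changes from query to query. Executing model-generated code with exec is unsafe outside a sandbox; the sketch does it only to keep the example short.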

research
06/08/2023

Modular Visual Question Answering via Code Generation

We present a framework that formulates visual question answering as modu...
research
11/18/2022

Visual Programming: Compositional visual reasoning without training

We present VISPROG, a neuro-symbolic approach to solving complex and com...
research
10/06/2022

Binding Language Models in Symbolic Languages

Though end-to-end neural approaches have recently been dominating NLP ta...
research
03/14/2018

Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning

Visual question answering requires high-order reasoning about an image, ...
research
08/31/2023

Experimenting with ChatGPT for Spreadsheet Formula Generation: Evidence of Risk in AI Generated Spreadsheets

Large Language Models (LLM) have become sophisticated enough that comple...
research
02/28/2021

PyCG: Practical Call Graph Generation in Python

Call graphs play an important role in different contexts, such as profil...
research
10/02/2018

A Knowledge Hunting Framework for Common Sense Reasoning

We introduce an automatic system that achieves state-of-the-art results ...
