AmadeusGPT: a natural language interface for interactive animal behavioral analysis

by Shaokai Ye, et al.

The process of quantifying and analyzing animal behavior involves translating the naturally occurring descriptive language of their actions into machine-readable code. Yet codifying behavior analysis is often challenging without a deep understanding of animal behavior and technical machine learning knowledge. To bridge this gap, we introduce AmadeusGPT: a natural language interface that turns natural language descriptions of behaviors into machine-executable code. Large language models (LLMs) such as GPT3.5 and GPT4 allow for interactive language-based queries that are potentially well suited to interactive behavior analysis. However, the comprehension capability of these LLMs is limited by the context window size, which prevents them from remembering distant conversations. To overcome the context window limitation, we implement a novel dual-memory mechanism that allows communication between short-term and long-term memory, using symbols as context pointers for retrieval and saving. Concretely, users directly use language-based definitions of behavior, and our augmented GPT develops code based on the core AmadeusGPT API, which contains machine learning, computer vision, spatio-temporal reasoning, and visualization modules. Users can then interactively refine results and seamlessly add new behavioral modules as needed. We benchmark AmadeusGPT and show that it can produce state-of-the-art performance on the MABE 2022 behavior challenge tasks. Notably, an end user would not need to write any code to achieve this. Thus, collectively, AmadeusGPT presents a novel way to merge deep biological knowledge, large language models, and core computer vision modules into a more naturally intelligent system. Code and demos can be found at:
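
As an illustration only (this is not the AmadeusGPT code), the dual-memory idea in the abstract might be sketched as follows. The class name `DualMemory`, its methods, the `<chase>` symbol notation, and the substring-based retrieval are all assumptions made for this sketch; the abstract only states that symbols act as context pointers between a short-term and a long-term memory.

```python
from collections import deque


class DualMemory:
    """Toy dual-memory store (hypothetical, for illustration only)."""

    def __init__(self, short_term_size=3):
        # Short-term memory stands in for the limited context window:
        # once full, the oldest message is silently dropped.
        self.short_term = deque(maxlen=short_term_size)
        # Long-term memory maps a symbol to the definition saved under it.
        self.long_term = {}

    def add_message(self, message):
        """Append a conversation turn to short-term memory."""
        self.short_term.append(message)

    def save(self, symbol, definition):
        """Store a definition in long-term memory under a symbol."""
        self.long_term[symbol] = definition

    def retrieve(self, query):
        """Return saved definitions whose symbol appears in the query."""
        return {s: d for s, d in self.long_term.items() if s in query}

    def build_context(self, query):
        """Assemble a prompt: retrieved definitions, recent turns, query."""
        retrieved = [f"{s}: {d}" for s, d in self.retrieve(query).items()]
        return retrieved + list(self.short_term) + [query]


memory = DualMemory(short_term_size=2)
memory.save("<chase>", "animal A follows animal B at close distance")
for turn in ["turn 1", "turn 2", "turn 3"]:
    memory.add_message(turn)

# "turn 1" has left short-term memory, but the <chase> definition is
# still recovered because the query mentions its symbol.
context = memory.build_context("Plot all <chase> events")
```

The design point is that a definition saved under a symbol survives even after the turn that introduced it has dropped out of the bounded buffer, which is the limitation the abstract attributes to the context window.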




