Improvised Comedy as a Turing Test

by   Kory Wallace Mathewson, et al.
IEEE Computer Society

The best improvisational theatre actors can make any scene partner, of any skill level or ability, appear talented and proficient in the art form, and thus "make them shine". To challenge this improvisational paradigm, we built an artificial intelligence (AI) trained to perform live shows alongside human actors for human audiences. Over the course of 30 performances to a combined audience of almost 3000 people, we have refined theatrical games which involve combinations of human and (at times, adversarial) AI actors. We have developed specific scene structures to include audience participants in interesting ways. Finally, we developed a complete show structure that submitted the audience to a Turing test and observed their suspension of disbelief, which we believe is key for human/non-human theatre co-creation.


page 3

page 4


Aligning Superhuman AI and Human Behavior: Chess as a Model System

As artificial intelligence becomes increasingly intelligent—in some case...

A Test for Evaluating Performance in Human-Computer Systems

The Turing test for comparing computer performance to that of humans is ...

Evaluation of the Handshake Turing Test for anthropomorphic Robots

Handshakes are fundamental and common greeting and parting gestures amon...

Improbotics: Exploring the Imitation Game using Machine Intelligence in Improvised Theatre

Theatrical improvisation (impro or improv) is a demanding form of live, ...

Toward a Human-Level Video Understanding Intelligence

We aim to develop an AI agent that can watch video clips and have a conv...

Niépce-Bell or Turing: How to Test Odor Reproduction?

In a 1950 article in Mind, decades before the existence of anything rese...

Post Selections Using Test Sets (PSUTS) and How Developmental Networks Avoid Them

This paper raises a rarely reported practice in Artificial Intelligence ...

1 Background

Theatrical improvisation is a form of live theatre where artists perform "real-time dynamic problem solving" through semi-structured spontaneous storytelling Magerko et al. (2009). Improvised comedy involves both performers and audience members in interactive formats. We present explorations in a theatrical Turing Test as part of an improvised comedy show. We have developed an artificial intelligence-based improvisational theatre actor—a chatbot with speech recognition and speech synthesis, with a physical humanoid robot embodiment Perlin and Goldberg (1996); Hayes-Roth and Van Gent (1996) and performed alongside it in improv showsAAAShow listings and recordings are available at at performing arts festivals, including ImproFest UK and the Brighton, Camden, and Edinburgh Fringe Festivals Mathewson and Mirowski (2017). Over these first 30 shows, one or two humans performed improvised scenes with the AI. The performers strove to endow the AI with human qualities of character/personality, relationship, status, emotion, perspective, and intelligence, according to common rules of improvisation Johnstone (2012); Napier (2004)

. Relying on custom state-of-the-art neural network software for language understanding and text generation, we were able to produce context-dependent replies for the AI actor.

The system we developed aims to maintain the illusion of intelligent dialogue. Improvised scenes developed emotional connections between imaginary characters played by humans and AI improvisors. The human-like characterization elicited attachment for the AI from audience members. Through various configurations (e.g. human-human, human-AI, and AI-AI) and different AI embodiments (e.g. voice alone, visual avatar, or robot), we challenged the audience to discriminate between human- and AI-led improvisation. In one particular game setup, through a Wizard-of-Oz illusion, we performed a Turing test inspired structure. We deceived the audience into believing that an AI was performing, then we asked them to compare that performance with a performance by an actual AI. Feedback from the audience, and from performers who have experimented with our system, provide insight for future development of improv games. Below we present details on how we debuted this technology to audiences, and provide strictl anecdotal observations collected over multiple performances.

2 Methods

We named our AI improviser A.L.Ex, the Artificial Language Experiment, an homage to Alex the Parrot, trained to communicate using a vocabulary of 150 words Pepperberg and Pepperberg (2009)

. The core of A.L.Ex consists of a text-based chatbot implemented as a word-level sequence-to-sequence recurrent neural network (4-layer LSTM encoder, similar decoder, and topic model inputs) with an output vocabulary of 50k words. The network was trained on cleaned and filtered subtitles from about 100k films

AAASubtitles from 100k movies were collected from

. Dialogue turn-taking, candidate sentence selection, and sentiment analysis

Hutto and Gilbert (2014)

on the input sentences are based on heuristics. The chatbot communicates with performers through out-of-the-box speech recognition and text-to-speech software. The chatbot runs on a local web server for modularity and allows for integration with physical embodiments (e.g. parallel control of a humanoid robot

BBBThe robot was manufactured by The server also enables remote connection which can override the chatbot and give dialog control to a human operator. Further technological implementation details are provided by Mathewson and Mirowski (2017).

An improvisational scene starts by soliciting suggestion for context from the audience (e.g., “non-geographical location” or “advice a grandparent might give”). The human performer then says several lines of dialogue to prime the AI with dense context. The scene continues through alternating lines of dialog between the human improviser(s) and the AI. Often through human justification, performers aim to maintain scene reality and ground narrative in believable storytelling. A typical scene lasts between 3-6 minutes, and is interrupted by the human performer when it reaches a natural ending (e.g. narrative conclusion or comical high point).

The first versions of the improvising artificial stage companions had their stage presence reduced to projected video and amplified sound. We evolved to physical embodiments (i.e. the humanoid robot) to project the attention of the performer(s) and audience on a material avatar. Our robotic performers are distinctly non-human in size, shape, material, actuation and lighting. We chose humanoid robotics because the more realistic an embodiment is the more comfortable humans often are with it; though comfort sharply drops when creatures have human-like qualities but are distinctly non-human Mori (1970).

The performances at the Camden and Edinburgh Fringe festivals involved a Turing test inspired scene conducted with the willing audience. We performed the scene by first deceiving the audience into believing that an AI was performing (whereas the chatbot and the robot were controlled by a human); then we performed a second scene with an actual AI. In game (1), we explained the Turing test first, then performed the two scenes consecutively and finally asked the audience to discriminate, through a vote, which scene was AI-led. In a different game (2), we performed the Wizard-of-Oz scene and then immediately asked, in character and as part of the performance, if the audience suspected that a human was in control of the chatbot.

3 Preliminary Observations and Conclusions

We summarize here anecdotal observations from our performance. In game (1), nearly everyone identified the AI from the human. However, we noted that in game (2) approximately half the audience members believed that an AI was performing flawlessly alongside human improvisor(s). When not forewarned about the Turing test, the audience (of various ages and genders) was convinced that the dialog system understood the details of the scene and responded immediately and contextually. The propensity of this delusion is likely driven by several factors: the context within which they are viewing the deception, the lack of personal awareness of the current state-of-the-art AI abilities, and emotional connections with the scene. Post-show discussions with audience members confirmed that when a performer tells the audience that an AI is controlling the robot’s dialogue, the audience members will trust this information. Being at an improvisational show, they expect to suspend disbelief and use their imagination. Most of them were also unaware of capabilities and limitations of state-of-the-art AI systems, which highlights the responsibility of the AI community to communicate progress in AI effectively and to effectively invite public understanding of AI ability. Finally, we observed that the introduction of a humanoid robot, with a human-like voice, increased the audiences’ propensity to immerse themselves in the imaginative narrative presented to them.

We plan to conduct an experimental study of the audience beliefs in shared AI and human creativitySchmidhuber (2010). We hope to better understand the way that audiences enjoy art when co-created by humans and AIs, to create better tools and mediums for human expression.


  • Magerko et al. [2009] Brian Magerko, Waleed Manzoul, Mark Riedl, Allan Baumer, Daniel Fuller, Kurt Luther, and Celia Pearce. An empirical study of cognition and theatrical improvisation. In Proceedings of the seventh ACM conference on Creativity and cognition, pages 117–126. ACM, 2009.
  • Perlin and Goldberg [1996] Ken Perlin and Athomas Goldberg. Improv: A system for scripting interactive actors in virtual worlds. In Proceedings of the 23rd annual conference on Computer graphics and interactive techniques, pages 205–216. ACM, 1996.
  • Hayes-Roth and Van Gent [1996] Barbara Hayes-Roth and Robert Van Gent. Improvisational puppets, actors, and avatars. In Computer Game Developers Conference, 1996.
  • Mathewson and Mirowski [2017] Kory Mathewson and Piotr Mirowski. Improvised theatre alongside artificial intelligences. In AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2017.
  • Johnstone [2012] Keith Johnstone. Impro: Improvisation and the theatre. Routledge, 2012.
  • Napier [2004] Mick Napier. Improvise: Scene from the inside out. Heinemann Drama, 2004.
  • Pepperberg and Pepperberg [2009] Irene M Pepperberg and Irene M Pepperberg. The Alex studies: cognitive and communicative abilities of grey parrots. Harvard University Press, 2009.
  • Hutto and Gilbert [2014] Clayton J Hutto and Eric Gilbert. Vader: A parsimonious rule-based model for sentiment analysis of social media text. In Eighth international AAAI conference on weblogs and social media, 2014.
  • Mori [1970] Masahiro Mori. The uncanny valley. Energy, 7(4):33–35, 1970.
  • Schmidhuber [2010] Jürgen Schmidhuber. Formal theory of creativity, fun, and intrinsic motivation (1990–2010). IEEE Transactions on Autonomous Mental Development, 2(3):230–247, 2010.

Supplementary Material


We would like to acknowledge the fantastic collaborators and supporters throughout the project, in particular Adam Meggido, Colin Mochrie, Matt Schuurman, Paul Blinov, Lana Cuthbertson, Patrick Pilarski, as well as Alessia Pannese, Stephen Davidson, Stuart Moses, Roisin Rae, John Agapiou, Arfie Mansfield, Luba Elliott, Shama Rahman, Holly Bartolo, Benoist Brucker, Charles Sabourdin, Katy Schutte and Steve Roe. Thank you to Rapid Fire Theatre for a space for new ideas to flourish.


Figure 1: System diagram of the Artificial Language Experiment (A.L.Ex).
Figure 2: Visual and physical embodiments of the AI improviser.
Figure 3: Two human performers and an audience volunteer improvising with a robot.