PUMICE: A Multi-Modal Agent that Learns Concepts and Conditionals from Natural Language and Demonstrations

08/30/2019
by Toby Jia-Jun Li, et al.

Natural language programming is a promising approach for enabling end users to teach new tasks to intelligent agents. However, our formative study found that end users often use unclear, ambiguous, or vague concepts when instructing tasks in natural language, especially when specifying conditionals. Existing systems offer limited support for letting users teach agents new concepts or explain unclear ones. In this paper, we describe a new multi-modal, domain-independent approach that combines natural language programming and programming-by-demonstration. It allows users to first describe tasks and their associated conditions naturally at a high level, and then collaborate with the agent to recursively resolve any ambiguities or vagueness through conversations and demonstrations. Users can also define new procedures and concepts by demonstrating and referring to contents within the GUIs of existing mobile apps. We demonstrate this approach in PUMICE, an end-user programmable agent that implements it. A lab study with 10 users showed its usability.
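
The recursive resolution of unclear concepts described above can be illustrated with a minimal sketch. This is not PUMICE's implementation: the `resolve` function, the `Knowledge` mapping, the concept names, and the scripted user answers are all hypothetical, chosen only to show how an agent might keep asking about unknown concepts until each one is grounded.

```python
from typing import Callable, Dict, List

# Hypothetical representation: each concept maps to the sub-concepts used to
# explain it. An empty list means the concept is grounded directly, for
# example by a demonstration on an app GUI.
Knowledge = Dict[str, List[str]]


def resolve(concept: str, known: Knowledge,
            ask_user: Callable[[str], List[str]]) -> List[str]:
    """Return the concepts newly learned while resolving `concept`."""
    if concept in known:
        return []  # already understood, nothing to ask
    sub_concepts = ask_user(concept)  # conversation or demonstration step
    known[concept] = sub_concepts     # record before recursing to avoid loops
    learned = [concept]
    for sub in sub_concepts:
        learned.extend(resolve(sub, known, ask_user))
    return learned


# Scripted stand-in for the user; a real agent would ask interactively.
scripted_answers = {
    "hot": ["temperature"],  # "hot" explained in terms of "temperature"
    "temperature": [],       # grounded by a (simulated) GUI demonstration
}

known: Knowledge = {}
learned = resolve("hot", known, lambda c: scripted_answers[c])
print(learned)
```

Because learned concepts are stored in `known`, resolving the same concept a second time asks the user nothing, which mirrors the idea that taught concepts can be reused in later instructions.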


