Grounding Open-Domain Instructions to Automate Web Support Tasks

03/30/2021
by Nancy Xu, et al.

Grounding natural language instructions on the web to perform previously unseen tasks enables accessibility and automation. We introduce a task and dataset to train AI agents from open-domain, step-by-step instructions originally written for people. We build RUSS (Rapid Universal Support Service) to tackle this problem. RUSS consists of two models. First, a BERT-LSTM with pointers parses instructions into ThingTalk, a domain-specific language we design for grounding natural language on the web. Then, a grounding model retrieves the unique IDs of any webpage elements requested in ThingTalk. RUSS may interact with the user through a dialogue (e.g., ask for an address) or execute a web operation (e.g., click a button) inside the web runtime. To augment training, we synthesize natural language instructions mapped to ThingTalk. Our dataset consists of 80 different customer service problems from help websites, with a total of 741 step-by-step instructions and their corresponding actions. RUSS achieves 76.7% end-to-end accuracy predicting agent actions from single instructions. It outperforms state-of-the-art models that directly map instructions to actions without ThingTalk. Our user study shows that RUSS is preferred by actual users over web navigation.
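
The two-stage design described above (parse an instruction into an intermediate form, then ground any element description to an element ID before acting) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the `Step` and `Element` types, the rule-based `parse`, and the word-overlap `ground` are hypothetical stand-ins, not the paper's BERT-LSTM parser, its grounding model, or actual ThingTalk syntax.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Element:
    """Hypothetical webpage element: a unique ID plus its visible text."""
    element_id: str
    text: str


@dataclass
class Step:
    """Hypothetical intermediate form standing in for a parsed instruction."""
    action: str       # "click" (web operation) or "ask_user" (dialogue)
    description: str  # element description, or the value to ask the user for


def parse(instruction: str) -> Step:
    """Toy rule-based stand-in for the semantic parser."""
    text = instruction.strip().rstrip(".")
    lowered = text.lower()
    if lowered.startswith("click "):
        return Step("click", text[len("click "):])
    if lowered.startswith("enter your "):
        return Step("ask_user", text[len("enter your "):])
    raise ValueError(f"unsupported instruction: {instruction!r}")


def ground(description: str, page: List[Element]) -> Optional[str]:
    """Toy grounding: return the ID of the element sharing the most words."""
    wanted = set(description.lower().split())
    best_id, best_overlap = None, 0
    for element in page:
        overlap = len(wanted & set(element.text.lower().split()))
        if overlap > best_overlap:
            best_id, best_overlap = element.element_id, overlap
    return best_id


def run(instruction: str, page: List[Element]) -> Tuple[str, Optional[str]]:
    """Parse one instruction, then either ask the user or act on the page."""
    step = parse(instruction)
    if step.action == "ask_user":
        return ("ask_user", step.description)
    return ("click", ground(step.description, page))


page = [Element("btn-1", "Sign in"), Element("btn-2", "Reset password")]
print(run("Click the Reset password button.", page))
print(run("Enter your address", page))
```

Keeping the parse and grounding steps separate means the parser never needs to see the page, and the grounding step only has to match a short description against the elements that are actually present.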

