Learning UI Navigation through Demonstrations composed of Macro Actions

10/16/2021
by Wei Li, et al.

We have developed a framework for reliably building agents capable of UI navigation. The state space is simplified from raw pixels to a set of UI elements extracted via screen understanding, such as OCR and icon detection. The action space is restricted to the UI elements plus a few global actions. Actions can be customized per task, and each action is a sequence of basic operations conditioned on status checks. With this design, we are able to train Deep Q-learning from Demonstrations (DQfD) and Behavioral Cloning (BC) agents with a small number of demonstration episodes. We propose demo augmentation, which significantly reduces the required number of human demonstrations. We customized DQfD to accept demonstrations collected on screenshots, which facilitates demo coverage of rare cases: demos are collected only for the cases that failed during evaluation of the previous version of the agent. With tens of iterations looping over evaluation, demo collection, and training, the agent reaches a 98.7% success rate on the search task in an environment of 80+ apps and websites where initial states and viewing parameters are randomized.
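The abstract describes each macro action as a sequence of basic operations, each conditioned on a status check. A minimal sketch of that structure is below; all class and function names are illustrative assumptions, not taken from the paper's implementation.

```python
# Sketch of a macro action: an ordered list of basic operations
# (e.g. tap, type, scroll), each gated by a status check that must
# hold before the operation runs. If any check fails, the macro aborts.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Operation:
    run: Callable[[], None]           # basic operation on the UI
    precondition: Callable[[], bool]  # status check before running it


def execute_macro(ops: List[Operation]) -> bool:
    """Run operations in order; return False if any status check fails."""
    for op in ops:
        if not op.precondition():
            return False
        op.run()
    return True
```

Under this reading, an agent's action space is a set of such macros per task, and a failed status check mid-macro is how rare cases surface for targeted demo collection.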


