DroidBot-GPT: GPT-powered UI Automation for Android

04/14/2023
by Hao Wen, et al.

This paper introduces DroidBot-GPT, a tool that uses GPT-like large language models (LLMs) to automate interactions with Android mobile applications. Given a natural language description of a desired task, DroidBot-GPT automatically generates and executes actions that navigate the app to complete the task. It works by translating the app's GUI state and the actions available on the smartphone screen into natural language prompts, then asking the LLM to choose an action. Because the LLM is typically trained on a large amount of data, including how-to manuals for diverse software applications, it can make reasonable choices of actions based on the provided information. We evaluate DroidBot-GPT on a self-created dataset of 33 tasks collected from 17 Android applications spanning 10 categories. It successfully completes 39.39% of the tasks, and the average partial completion progress is about 66.76%. Given that our method is fully unsupervised (it requires no modification to either the app or the LLM), we believe there is great potential to enhance automation performance with better app development paradigms and/or custom model training.
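The prompt-and-choose loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt wording, the function names (`build_prompt`, `parse_choice`, `choose_action`), and the `ask_llm` callable are all hypothetical, and the LLM is represented by any function that maps a prompt string to a reply string.

```python
import re
from typing import Callable, List, Optional


def build_prompt(task: str, screen_summary: str, actions: List[str]) -> str:
    """Render the task, the GUI state, and the candidate actions as plain text.

    The wording is an assumption for illustration, not the paper's prompt.
    """
    lines = [
        f"Task: {task}",
        f"Current screen: {screen_summary}",
        "Available actions:",
    ]
    # Number each candidate action so the model can answer with an index.
    lines += [f"{i}. {action}" for i, action in enumerate(actions)]
    lines.append("Reply with the number of the single action to perform next.")
    return "\n".join(lines)


def parse_choice(reply: str, num_actions: int) -> Optional[int]:
    """Extract the first integer in the reply; return None if absent or out of range."""
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    index = int(match.group())
    return index if 0 <= index < num_actions else None


def choose_action(
    task: str,
    screen_summary: str,
    actions: List[str],
    ask_llm: Callable[[str], str],  # hypothetical: any prompt -> reply function
) -> Optional[str]:
    """Ask the model to pick one of the available actions for the current screen."""
    prompt = build_prompt(task, screen_summary, actions)
    index = parse_choice(ask_llm(prompt), len(actions))
    return None if index is None else actions[index]


if __name__ == "__main__":
    # A stand-in for a real model, used only to exercise the loop.
    def fake_llm(prompt: str) -> str:
        return "I choose 1."

    candidate_actions = ['tap the button "Cancel"', 'tap the button "Save"']
    print(choose_action("Save the note", "a note editor", candidate_actions, fake_llm))
```

In a full automation loop, the chosen action would be executed on the device and the new screen would be summarized for the next prompt, repeating until the task is complete or a step limit is reached. Returning `None` for an unparseable reply leaves the retry policy to the caller.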


