Interactive Mobile App Navigation with Uncertain or Under-specified Natural Language Commands

02/04/2022
by Andrea Burns, et al.

We introduce Mobile app Tasks with Iterative Feedback (MoTIF), a new dataset where the goal is to complete a natural language query in a mobile app. Current datasets for related tasks in interactive question answering, visual common sense reasoning, and question-answer plausibility prediction do not support research in resolving ambiguous natural language requests or operating in diverse digital domains. As a result, they fail to capture the complexities of real question answering or interactive tasks. In contrast, MoTIF contains natural language requests that are not satisfiable, making it the first work to investigate this issue for interactive vision-language tasks. MoTIF also contains follow-up questions for ambiguous queries to enable research on task uncertainty resolution. We introduce task feasibility prediction and propose an initial model which obtains an F1 score of 61.1. We next benchmark task automation with our dataset and find that adaptations of prior work perform poorly on our realistic language requests, obtaining an accuracy of only 20.2% when mapping commands to grounded actions. We analyze performance and gain insight for future work that may bridge the gap between current model ability and what is needed for successful use in applications.
