META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI

05/23/2022
by Liangtai Sun, et al.

Task-oriented dialogue (TOD) systems have been widely used by mobile phone intelligent assistants to accomplish tasks such as calendar scheduling or hotel booking. Current TOD systems usually focus on multi-turn text/speech interaction and rely on calling back-end APIs to search database information or execute the task on the mobile phone. However, this architecture greatly limits the information-searching capability of intelligent assistants and may even lead to task failure if APIs are not available or the task is too complicated to be executed by the provided APIs. In this paper, we propose a new TOD architecture: the GUI-based task-oriented dialogue system (GUI-TOD). A GUI-TOD system can directly perform GUI operations on real apps and execute tasks without invoking back-end APIs. Furthermore, we release META-GUI, a dataset for training a multi-modal conversational agent on mobile GUI. We also propose a multi-modal action prediction and response model. It shows promising results on META-GUI, but there is still room for further improvement. The dataset and models will be publicly available.


