Multi-Modal End-User Programming of Web-Based Virtual Assistant Skills

by Michael H. Fischer, et al.

While Alexa can perform over 100,000 skills on paper, its capability covers only a fraction of what is possible on the web. For an assistant to reach its full potential, individuals should be able to create skills that automate their personal web browsing routines. Many seemingly simple routines, however, such as monitoring COVID-19 stats for their hometown, detecting changes in their child's grades online, or sending personally addressed messages to a group, cannot be automated without conventional programming concepts such as conditional and iterative evaluation. This paper presents VASH (Voice Assistant Scripting Helper), a new system that empowers users to create useful web-based virtual assistant skills without learning a formal programming language. With VASH, the user demonstrates their task of interest in the browser and issues a few voice commands, such as naming the skill and adding conditions on the action. VASH turns these multi-modal specifications into skills that can be invoked by voice on a virtual assistant. These skills are represented in a formal programming language we designed called WebTalk, which supports parameterization, function invocation, conditionals, and iterative execution. VASH is a fully working prototype that runs in the Chrome browser on real-world websites. Our user study shows that users have many web routines they wish to automate, 81% of which can be expressed using VASH. We found that VASH is easy to learn, and that a majority of the users in our study want to use our system.





