Audio-Visual Understanding of Passenger Intents for In-Cabin Conversational Agents

07/08/2020
by Eda Okur, et al.

Building multimodal dialogue understanding capabilities situated in the in-cabin context is crucial to enhance passenger comfort in autonomous vehicle (AV) interaction systems. To this end, understanding passenger intents from spoken interactions and vehicle vision systems is a key component for developing contextual and visually grounded conversational agents for AVs. Towards this goal, we explore AMIE (Automated-vehicle Multimodal In-cabin Experience), the in-cabin agent responsible for handling multimodal passenger-vehicle interactions. In this work, we discuss the benefits of a multimodal understanding of in-cabin utterances that combines verbal/language input with non-verbal/acoustic and visual cues from inside and outside the vehicle. In our experiments, the multimodal approach outperformed text-only baselines on intent detection.
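To make the idea of combining modalities concrete, the following is a minimal sketch of feature-level (early) fusion for intent detection. It is not the AMIE implementation: the abstract does not describe the model, so the nearest-centroid classifier, the toy feature vectors, and the intent labels `set_destination` and `stop` are all assumptions chosen for illustration.

```python
from math import dist

def fuse(text_vec, audio_vec, visual_vec):
    """Early fusion: concatenate the per-modality feature vectors."""
    return list(text_vec) + list(audio_vec) + list(visual_vec)

def centroid(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def train(examples):
    """Build one centroid per intent from (fused_vector, intent) pairs."""
    by_label = {}
    for vec, label in examples:
        by_label.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_label.items()}

def predict(centroids, vec):
    """Return the intent whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))

# Toy data (hypothetical): 2-d text, 1-d acoustic, 1-d visual features.
examples = [
    (fuse([1.0, 0.0], [0.2], [0.0]), "set_destination"),
    (fuse([0.9, 0.1], [0.3], [0.1]), "set_destination"),
    (fuse([0.0, 1.0], [0.8], [1.0]), "stop"),
    (fuse([0.1, 0.9], [0.9], [0.9]), "stop"),
]
centroids = train(examples)

# The text features alone are ambiguous here ([0.5, 0.5]);
# the acoustic and visual features tip the decision.
query = fuse([0.5, 0.5], [0.8], [0.9])
prediction = predict(centroids, query)
```

In this toy setup the text part of the query is equidistant from both intents, so the prediction is decided entirely by the acoustic and visual components, which is the kind of benefit the abstract attributes to a multimodal approach.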


research
09/20/2019

Towards Multimodal Understanding of Passenger-Vehicle Interactions in Autonomous Vehicles: Intent/Slot Recognition Utilizing Audio-Visual Data

Understanding passenger intents from spoken interactions and car's visio...
research
11/22/2021

Building Goal-Oriented Dialogue Systems with Situated Visual Context

Most popular goal-oriented dialogue agents are capable of understanding ...
research
12/14/2018

Conversational Intent Understanding for Passengers in Autonomous Vehicles

Understanding passenger intents and extracting relevant slots are import...
research
10/20/2018

A Knowledge-Grounded Multimodal Search-Based Conversational Agent

Multimodal search-based dialogue is a challenging new task: It extends v...
research
06/02/2020

Situated and Interactive Multimodal Conversations

Next generation virtual assistants are envisioned to handle multimodal i...
research
12/28/2019

All-in-One Image-Grounded Conversational Agents

As single-task accuracy on individual language and image tasks has impro...
research
04/23/2019

Natural Language Interactions in Autonomous Vehicles: Intent Detection and Slot Filling from Passenger Utterances

Understanding passenger intents and extracting relevant slots are import...
