MAEA: Multimodal Attribution for Embodied AI

07/25/2023
by Vidhi Jain, et al.

Understanding multimodal perception for embodied AI (EAI) is an open question because the inputs may carry information that is highly complementary as well as redundant for the task. A relevant direction for multimodal policies is understanding the global trends of each modality at the fusion layer. To this end, we disentangle the attributions for visual, language, and previous-action inputs across different policies trained on the ALFRED dataset. Attribution analysis can be used to rank and group failure scenarios, investigate modeling and dataset biases, and critically analyze multimodal EAI policies for robustness and user trust before deployment. We present MAEA, a framework to compute global attributions per modality for any differentiable policy. In addition, we show how language and visual attributions enable lower-level behavior analysis in EAI policies.
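To make the idea of per-modality attribution at the fusion layer concrete, the following is a minimal sketch and not the MAEA implementation. It assumes a gradient-times-input attribution, a fused feature vector formed by concatenating visual, language, and previous-action features, and a toy policy with fixed weights. The names `toy_policy`, `WEIGHTS`, and `MODALITY_SLICES` are hypothetical, and the gradient is estimated by finite differences only to keep the example dependency-free.

```python
import math

# Hypothetical layout of the fused feature vector: name -> (start, end) slice.
MODALITY_SLICES = {"visual": (0, 3), "language": (3, 5), "prev_action": (5, 6)}

# Hypothetical fixed weights standing in for a trained policy head.
WEIGHTS = [0.5, -0.25, 0.25, 1.0, -1.0, 0.125]


def toy_policy(fused):
    """Toy differentiable policy: scalar score for one action from fused features."""
    return math.tanh(sum(w * x for w, x in zip(WEIGHTS, fused)))


def gradient(policy, fused, eps=1e-5):
    """Central finite-difference estimate of d policy / d fused."""
    grads = []
    for i in range(len(fused)):
        up = list(fused)
        up[i] += eps
        down = list(fused)
        down[i] -= eps
        grads.append((policy(up) - policy(down)) / (2 * eps))
    return grads


def modality_attribution(policy, fused, slices=MODALITY_SLICES):
    """Share of |gradient * input| mass per modality for one sample."""
    grads = gradient(policy, fused)
    scores = [abs(g * x) for g, x in zip(grads, fused)]
    total = sum(scores)
    if total == 0.0:
        return {name: 0.0 for name in slices}
    return {name: sum(scores[a:b]) / total for name, (a, b) in slices.items()}


def global_attribution(policy, samples, slices=MODALITY_SLICES):
    """Average the per-sample modality shares over a set of samples."""
    per_sample = [modality_attribution(policy, s, slices) for s in samples]
    return {
        name: sum(p[name] for p in per_sample) / len(per_sample)
        for name in slices
    }
```

Calling `global_attribution(toy_policy, samples)` on a list of fused feature vectors returns one share per modality, which can then be compared across policies. With a real policy, the finite-difference `gradient` would normally be replaced by automatic differentiation in the framework the policy is written in.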


