ECHo: Event Causality Inference via Human-centric Reasoning

05/24/2023
by   Yuxi Xie, et al.

We introduce ECHo, a diagnostic dataset of event causality inference grounded in visual-and-linguistic social scenarios. ECHo employs real-world human-centric deductive information collected from crime drama, bridging the gap in multimodal reasoning towards higher social intelligence through the elicitation of intermediate Theory-of-Mind (ToM). We propose a unified framework aligned with the Chain-of-Thought (CoT) paradigm to assess the reasoning capability of current AI systems. This ToM-enhanced CoT pipeline can accommodate and integrate various large foundation models in zero-shot visual-and-linguistic understanding. With this framework, we scrutinize the advanced large language and multimodal models via three complementary human-centric ECHo tasks. Further analysis demonstrates ECHo as a challenging dataset to expose imperfections and inconsistencies in reasoning.
