A Survey on Interpretable Cross-modal Reasoning

09/05/2023
by Dizhan Xue, et al.

In recent years, cross-modal reasoning (CMR), the process of understanding and reasoning across different modalities, has emerged as a pivotal area with applications ranging from multimedia analysis to healthcare diagnostics. As the deployment of AI systems becomes more ubiquitous, the demand for transparency and comprehensibility in these systems' decision-making processes has intensified. This survey delves into the realm of interpretable cross-modal reasoning (I-CMR), where the objective is not only to achieve high predictive performance but also to provide human-understandable explanations for the results. It presents a comprehensive overview of the typical methods, organized in a three-level taxonomy for I-CMR. It also reviews the existing CMR datasets that include annotations for explanations. Finally, it summarizes the challenges for I-CMR and discusses potential future directions. Overall, this survey aims to catalyze progress in this emerging research area by giving researchers a panoramic perspective that illuminates the state of the art and identifies open opportunities. The summarized methods, datasets, and other resources are available at https://github.com/ZuyiZhou/Awesome-Interpretable-Cross-modal-Reasoning.

research
08/28/2023

Cross-Modal Retrieval: A Systematic Review of Methods and Future Directions

With the exponential surge in diverse multi-modal data, traditional uni-...
research
05/15/2021

Premise-based Multimodal Reasoning: A Human-like Cognitive Process

Reasoning is one of the major challenges of Human-like AI and has recent...
research
08/15/2023

EMID: An Emotional Aligned Dataset in Audio-Visual Modality

In this paper, we propose Emotionally paired Music and Image Dataset (EM...
research
06/07/2023

A Survey on Knowledge Graphs for Healthcare: Resources, Applications, and Promises

Healthcare knowledge graphs (HKGs) have emerged as a promising tool for ...
research
07/01/2023

SHARCS: Shared Concept Space for Explainable Multimodal Learning

Multimodal learning is an essential paradigm for addressing complex real...
research
03/28/2022

Image-text Retrieval: A Survey on Recent Research and Development

In the past few years, cross-modal image-text retrieval (ITR) has experi...
research
08/20/2022

Learning in Audio-visual Context: A Review, Analysis, and New Perspective

Sight and hearing are two senses that play a vital role in human communi...
