MMHQA-ICL: Multimodal In-context Learning for Hybrid Question Answering over Text, Tables and Images

09/09/2023
by   Weihao Liu, et al.

In the real world, knowledge often exists in multimodal and heterogeneous forms. Question answering over hybrid data types, including text, tables, and images (MMHQA), is a challenging task. Recently, with the rise of large language models (LLMs), in-context learning (ICL) has become the most popular way to solve QA problems. We propose the MMHQA-ICL framework to address this problem; it includes a stronger heterogeneous data retriever and an image caption module. Most importantly, we propose a Type-specific In-context Learning Strategy for MMHQA that enables LLMs to apply their strong capabilities to this task. We are the first to use an end-to-end LLM prompting method for this task. Experimental results demonstrate that our framework outperforms all baselines and methods trained on the full dataset, achieving state-of-the-art results under the few-shot setting on the MultimodalQA dataset.
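As a rough illustration of what a type-specific in-context prompt might look like, the minimal Python sketch below picks an instruction and a set of demonstrations according to the modalities of the retrieved evidence. The abstract does not specify the question types, prompt wording, or any interfaces, so everything here (`Evidence`, `TYPE_INSTRUCTIONS`, `classify_question_type`, `build_prompt`, and the four type names) is a hypothetical assumption, not the authors' implementation. Images are assumed to have already been converted to captions by a separate caption module.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    modality: str  # assumed labels: "text", "table", or "image"
    content: str   # passage text, linearized table, or image caption


# Hypothetical per-type instructions; the source does not list the real ones.
TYPE_INSTRUCTIONS = {
    "text": "Answer the question using the passages.",
    "table": "Answer the question using the table.",
    "image": "Answer the question using the image captions.",
    "hybrid": "Answer the question by combining all of the evidence.",
}


def classify_question_type(evidence):
    """Assumed heuristic: one modality keeps its own type, otherwise 'hybrid'."""
    modalities = {e.modality for e in evidence}
    if len(modalities) == 1:
        return next(iter(modalities))
    return "hybrid"


def build_prompt(question, evidence, demonstrations):
    """Assemble a few-shot prompt whose instruction and demos depend on the type."""
    qtype = classify_question_type(evidence)
    lines = [TYPE_INSTRUCTIONS[qtype], ""]
    # Few-shot demonstrations are looked up per question type.
    for demo_question, demo_answer in demonstrations.get(qtype, []):
        lines += [f"Q: {demo_question}", f"A: {demo_answer}", ""]
    # Retrieved evidence, each item tagged with its modality.
    for e in evidence:
        lines.append(f"[{e.modality}] {e.content}")
    lines += [f"Q: {question}", "A:"]
    return qtype, "\n".join(lines)
```

The returned prompt string would then be sent to whichever LLM is in use; that call is left out because the abstract does not name a model or API.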


research
08/04/2017

MemexQA: Visual Memex Question Answering

This paper proposes a new task, MemexQA: given a collection of photos or...
research
04/27/2019

Using Context Information to Enhance Simple Question Answering

With the rapid development of knowledge bases(KBs),question answering(QA...
research
07/06/2022

BioTABQA: Instruction Learning for Biomedical Table Question Answering

Table Question Answering (TQA) is an important but under-explored task. ...
research
04/25/2020

MCQA: Multimodal Co-attention Based Network for Question Answering

We present MCQA, a learning-based algorithm for multimodal question answ...
research
04/24/2023

Extreme Classification for Answer Type Prediction in Question Answering

Semantic answer type prediction (SMART) is known to be a useful step tow...
research
08/20/2023

LibriSQA: Advancing Free-form and Open-ended Spoken Question Answering with a Novel Dataset and Framework

While Large Language Models (LLMs) have demonstrated commendable perform...
research
04/25/2022

Conversational Question Answering on Heterogeneous Sources

Conversational question answering (ConvQA) tackles sequential informatio...
