PDF-VQA: A New Dataset for Real-World VQA on PDF Documents

04/13/2023
by   Yihao Ding, et al.

Document-based Visual Question Answering (VQA) examines how well a model understands document images when prompted with natural-language questions. We propose a new document-based VQA dataset, PDF-VQA, to comprehensively examine document understanding from several aspects, including document element recognition, document layout structure understanding, contextual understanding, and key information extraction. PDF-VQA extends the scope of document understanding from a single document page to questions asked over a full document of multiple pages. We also propose a new graph-based VQA model that explicitly integrates the spatial and hierarchical structural relationships between document elements to improve document structure understanding. We compare its performance with several baselines across different question types and tasks. (The full dataset will be released after paper acceptance.)
