Workshop on Document Intelligence Understanding

07/31/2023
by Soyeon Caren Han, et al.

Document understanding and information extraction cover a range of tasks for automatically interpreting documents and extracting valuable information from them. Demand for document understanding is rising across domains such as business, law, and medicine, where it can make work involving large volumes of documents more efficient. This workshop aims to bring together researchers and industry developers working on document intelligence and on understanding diverse document types, in order to advance automatic document processing and understanding techniques. We also released a data challenge on the recently introduced document-level VQA dataset, PDFVQA. The PDFVQA challenge examines the structural and contextual understanding of proposed models at the level of natural full documents spanning multiple consecutive pages, by including questions whose answers form a sequence extracted from multiple pages of the full document. This task helps move document understanding from the single-page level to the full-document level.

research
04/13/2023

PDFVQA: A New Dataset for Real-World VQA on PDF Documents

Document-based Visual Question Answering examines the document understan...
research
07/25/2022

Towards Complex Document Understanding By Discrete Reasoning

Document Visual Question Answering (VQA) aims to understand visually-ric...
research
05/01/2020

SciREX: A Challenge Dataset for Document-Level Information Extraction

Extracting information from full documents is an important problem in ma...
research
11/27/2020

A Survey of Deep Learning Approaches for OCR and Document Understanding

Documents are a core part of many businesses in many fields such as law,...
research
05/15/2023

Document Understanding Dataset and Evaluation (DUDE)

We call on the Document AI (DocAI) community to reevaluate current metho...
research
06/24/2015

Unshredding of Shredded Documents: Computational Framework and Implementation

A shredded document D is a document whose pages have been cut into strip...
research
11/24/2021

Handling tree-structured text: parsing directory pages

The determination of the reading sequence of text is fundamental to docu...
