Doc2SoarGraph: Discrete Reasoning over Visually-Rich Table-Text Documents with Semantic-Oriented Hierarchical Graphs

05/03/2023
by Fengbin Zhu, et al.

Discrete reasoning over table-text documents (e.g., financial reports) has gained increasing attention in the past two years. Existing works mostly simplify this challenge by manually selecting document pages and transforming them into structured tables and paragraphs, which hinders their practical application. In this work, we explore a more realistic problem setting in the form of TAT-DQA, i.e., answering a question over a visually-rich table-text document. Specifically, we propose a novel Doc2SoarGraph framework with enhanced discrete reasoning capability, which harnesses the differences and correlations among different elements (e.g., quantities, dates) of the given question and document using Semantic-oriented hierarchical Graph structures. We conduct extensive experiments on the TAT-DQA dataset, and the results show that our proposed framework outperforms the best baseline model by 17.73 on the test set, achieving a new state of the art.


research
07/25/2022

Towards Complex Document Understanding By Discrete Reasoning

Document Visual Question Answering (VQA) aims to understand visually-ric...
research
03/22/2019

Line-items and table understanding in structured documents

Table detection and extraction has been studied in the context of docume...
research
09/16/2023

PDFTriage: Question Answering over Long, Structured Documents

Large Language Models (LLMs) have issues with document question answerin...
research
03/27/2023

TabIQA: Table Questions Answering on Business Document Images

Table answering questions from business documents has many challenges th...
research
10/14/2020

A Graph Representation of Semi-structured Data for Web Question Answering

The abundant semi-structured data on the Web, such as HTML-based tables ...
research
11/05/2019

DocParser: Hierarchical Structure Parsing of Document Renderings

Translating document renderings (e.g. PDFs, scans) into hierarchical str...
research
05/30/2023

Table Detection for Visually Rich Document Images

Table Detection (TD) is a fundamental task towards visually rich documen...
