Table Detection for Visually Rich Document Images

05/30/2023
by Bin Xiao, et al.

Table Detection (TD) is a fundamental task for visually rich document understanding. Current studies usually formulate TD as an object detection problem, then use Intersection over Union (IoU) based metrics to evaluate model performance and IoU-based loss functions to optimize the model. TD applications usually require predictions to cover all of the table content and avoid information loss. However, IoU and IoU-based loss functions cannot directly reflect the degree of information loss in a prediction. We therefore propose to decouple IoU into a ground truth coverage term and a prediction coverage term, where the former measures the information loss of the prediction. In addition, tables in documents are usually large, sparsely distributed, and non-overlapping, because they are designed to summarize essential information in a form that human readers can read and interpret easily. In this study, we therefore use SparseR-CNN as the base model and further improve it with Gaussian Noise Augmented Image Size region proposals and many-to-one label assignments. To demonstrate the effectiveness of the proposed method and to compare fairly with state-of-the-art methods, we conduct experiments and evaluate model performance with IoU-based metrics. The experimental results show that the proposed method consistently outperforms state-of-the-art methods under different IoU-based metrics on a variety of datasets. We conduct further experiments to show the advantage of the proposed decoupled IoU for TD applications by replacing the IoU-based loss functions and evaluation metrics with their decoupled IoU counterparts. The experimental results show that the proposed decoupled IoU loss encourages the model to reduce information loss.
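The decoupling idea can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: ground truth coverage is taken as the intersection area divided by the ground truth area, and prediction coverage as the intersection area divided by the prediction area. The names `decoupled_iou`, `gt_coverage`, and `pred_coverage` are hypothetical and are not taken from the paper. Under these assumed definitions, and whenever the boxes intersect, IoU equals 1 / (1 / gt_coverage + 1 / pred_coverage - 1), so the two terms together carry all the information in IoU.

```python
from typing import NamedTuple, Tuple

# Axis-aligned box as (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


class DecoupledIoU(NamedTuple):
    iou: float
    gt_coverage: float    # assumed definition: |P ∩ G| / |G|
    pred_coverage: float  # assumed definition: |P ∩ G| / |P|


def box_area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes have zero area."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def decoupled_iou(pred: Box, gt: Box) -> DecoupledIoU:
    """Return IoU together with the two assumed coverage terms."""
    # Intersection rectangle; box_area clamps it to zero if the boxes are disjoint.
    inter = box_area((
        max(pred[0], gt[0]),
        max(pred[1], gt[1]),
        min(pred[2], gt[2]),
        min(pred[3], gt[3]),
    ))
    pred_area = box_area(pred)
    gt_area = box_area(gt)
    union = pred_area + gt_area - inter

    iou = inter / union if union > 0 else 0.0
    gt_cov = inter / gt_area if gt_area > 0 else 0.0
    pred_cov = inter / pred_area if pred_area > 0 else 0.0
    return DecoupledIoU(iou, gt_cov, pred_cov)


# A 10 x 10 ground truth table and two predictions with the same IoU.
table = (0.0, 0.0, 10.0, 10.0)
cropped = decoupled_iou((0.0, 0.0, 10.0, 8.0), table)   # cuts off table rows
padded = decoupled_iou((0.0, 0.0, 10.0, 12.5), table)   # includes extra margin
```

In this example both predictions have an IoU of 0.8, but the cropped box covers only 80% of the ground truth table while the padded box covers all of it. IoU alone cannot separate the two cases, whereas the ground truth coverage term exposes the information loss in the cropped prediction.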


