GVdoc: Graph-based Visual Document Classification

05/26/2023
by Fnu Mohbat, et al.

The robustness of a model for real-world deployment is determined by how well it performs on unseen data and how well it distinguishes between in-domain and out-of-domain samples. Visual document classifiers have shown impressive performance on in-distribution test sets. However, they tend to struggle to correctly classify and differentiate out-of-distribution examples. Image-based classifiers lack the text component, whereas multi-modality transformer-based models face the token serialization problem in visual documents because of their diverse layouts. Transformer-based models also require substantial computing power during inference, making them impractical for many real-world applications. We propose GVdoc, a graph-based document classification model that addresses both of these challenges. Our approach generates a document graph based on the document's layout, and then trains a graph neural network to learn node and graph embeddings. Through experiments, we show that our model, even with fewer parameters, outperforms state-of-the-art models on out-of-distribution data while retaining comparable performance on the in-distribution test set.
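
To make the idea of a layout-based document graph more concrete, here is a minimal sketch in plain Python. It is not the GVdoc implementation: the abstract does not say how edges are chosen or which graph neural network is used, so this example assumes a k-nearest-neighbor rule over OCR box centers, a single untrained mean-aggregation step in place of a learned GNN layer, and mean pooling for the graph embedding. The function names, the toy boxes, and the toy features are all hypothetical.

```python
import math

def knn_layout_graph(boxes, k=2):
    """Connect each box to its k nearest boxes by center distance (undirected).

    Assumption: k-NN over box centers stands in for whatever layout-based
    edge rule the paper actually uses.
    """
    centers = [((x0 + x1) / 2, (y0 + y1) / 2) for x0, y0, x1, y1 in boxes]
    edges = set()
    for i, ci in enumerate(centers):
        # Rank all other nodes by distance; break ties by index for determinism.
        ranked = sorted(
            (j for j in range(len(centers)) if j != i),
            key=lambda j: (math.dist(ci, centers[j]), j),
        )
        for j in ranked[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)

def mean_aggregate(features, edges):
    """One untrained message-passing round: average each node with its neighbors."""
    neighbors = {i: [i] for i in range(len(features))}  # include a self-loop
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    dim = len(features[0])
    return [
        [sum(features[n][d] for n in neighbors[i]) / len(neighbors[i])
         for d in range(dim)]
        for i in range(len(features))
    ]

def graph_embedding(node_embeddings):
    """Mean-pool node embeddings into a single graph-level vector."""
    dim = len(node_embeddings[0])
    return [sum(v[d] for v in node_embeddings) / len(node_embeddings)
            for d in range(dim)]

# Hypothetical OCR boxes (x0, y0, x1, y1): two tokens on a top line,
# two tokens on a line further down the page.
boxes = [(0, 0, 10, 5), (12, 0, 22, 5), (0, 50, 10, 55), (12, 50, 22, 55)]
# Toy 2-dimensional token features; a real system would use text embeddings.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

edges = knn_layout_graph(boxes, k=1)
node_embeddings = mean_aggregate(features, edges)
doc_vector = graph_embedding(node_embeddings)
```

In this sketch the graph depends only on where tokens sit on the page, not on a reading order, which is the property that lets a graph-based model sidestep the token serialization problem described above. A trained model would replace `mean_aggregate` with learned GNN layers and feed `doc_vector` to a classifier.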


