LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding

02/28/2022
by   Jiapeng Wang, et al.

Structured document understanding has attracted considerable attention and made significant progress recently, owing to its crucial role in intelligent document processing. However, most existing models can only handle documents in the specific language(s) (typically English) included in their pre-training collection, which severely limits their applicability. To address this issue, we propose a simple yet effective Language-independent Layout Transformer (LiLT) for structured document understanding. LiLT can be pre-trained on structured documents in a single language and then directly fine-tuned on other languages with the corresponding off-the-shelf monolingual or multilingual pre-trained textual models. Experimental results on eight languages show that LiLT achieves competitive or even superior performance on diverse, widely used downstream benchmarks, so the benefit of pre-training on document layout structure carries over independently of language. Code and model are publicly available at https://github.com/jpWang/LiLT.

research
10/16/2021

MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding

Multimodal pre-training with text, layout, and image has made significan...
research
05/08/2023

Language Independent Neuro-Symbolic Semantic Parsing for Form Understanding

Recent works on form understanding mostly employ multimodal transformers...
research
12/15/2021

Value Retrieval with Arbitrary Queries for Form-like Documents

We propose value retrieval with arbitrary queries for form-like document...
research
07/14/2023

TALL: Thumbnail Layout for Deepfake Video Detection

The growing threats of deepfakes to society and cybersecurity have raise...
research
10/12/2022

ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding

Recent years have witnessed the rise and success of pre-training techniq...
research
05/04/2015

Learning Document Image Binarization from Data

In this paper we present a fully trainable binarization solution for deg...
research
08/21/2023

Performance Enhancement Leveraging Mask-RCNN on Bengali Document Layout Analysis

Understanding digital documents is like solving a puzzle, especially his...
