Technical Report on Web-based Visual Corpus Construction for Visual Document Understanding

11/07/2022

We present a dataset generator engine named Web-based Visual Corpus Builder (Webvicob). Webvicob can readily construct a large-scale visual corpus (i.e., images with text annotations) from a raw Wikipedia HTML dump. In this report, we validate that Webvicob-generated data covers a wide range of contexts and knowledge and helps practitioners build a powerful Visual Document Understanding (VDU) backbone. The proposed engine is publicly available at https://github.com/clovaai/webvicob.
