Deqiang Jiang

research

∙ 09/03/2023

Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration

We propose a novel end-to-end document understanding model called SeRum ...

0 Haoyu Cao, et al. ∙

research

∙ 06/06/2023

Looking and Listening: Audio Guided Text Recognition

Text recognition in the wild is a long-standing problem in computer visi...

0 Wenwen Yu, et al. ∙

research

∙ 05/12/2023

Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution

Visual information extraction (VIE), which aims to simultaneously perfor...

0 Jianfeng Kuang, et al. ∙

research

∙ 03/16/2023

Grab What You Need: Rethinking Complex Table Structure Recognition with Flexible Components Deliberation

Recently, Table Structure Recognition (TSR) task, aiming at identifying ...

0 Hao Liu, et al. ∙

research

∙ 02/28/2023

Turning a CLIP Model into a Scene Text Detector

The recent large-scale Contrastive Language-Image Pretraining (CLIP) mod...

0 Wenwen Yu, et al. ∙

research

∙ 08/22/2022

TaCo: Textual Attribute Recognition via Contrastive Learning

As textual attributes like font are core design elements of document for...

0 Chang Nie, et al. ∙

research

∙ 07/11/2022

GMN: Generative Multi-modal Network for Practical Document Information Extraction

Document Information Extraction (DIE) has attracted increasing attention...

0 Haoyu Cao, et al. ∙

research

∙ 07/04/2022

OS-MSL: One Stage Multimodal Sequential Link Framework for Scene Segmentation and Classification

Scene segmentation and classification (SSC) serve as a critical step tow...

0 Ye Liu, et al. ∙

research

∙ 05/22/2022

Sequence-to-Action: Grammatical Error Correction with Action Guided Sequence Generation

The task of Grammatical Error Correction (GEC) has received remarkable a...

0 Jiquan Li, et al. ∙

research

∙ 05/05/2022

Relational Representation Learning in Visually-Rich Documents

Relational understanding is critical for a number of visually-rich docum...

0 Xin Li, et al. ∙

research

∙ 04/18/2022

The Devil is in the Frequency: Geminated Gestalt Autoencoder for Self-Supervised Visual Pre-Training

The self-supervised Masked Image Modeling (MIM) schema, following "mask-...

7 Hao Liu, et al. ∙

research

∙ 11/26/2021

Neural Collaborative Graph Machines for Table Structure Recognition

Recently, table structure recognition has achieved impressive progress w...

3 Hao Liu, et al. ∙

research

∙ 11/25/2021

NomMer: Nominate Synergistic Context in Vision Transformer for Visual Recognition

Recently, Vision Transformers (ViT), with the self-attention (SA) as the...

0 Hao Liu, et al. ∙

research

∙ 02/26/2020

PuzzleNet: Scene Text Detection by Segment Context Graph Learning

Recently, a series of decomposition-based scene text detection methods h...

14 Hao Liu, et al. ∙
