Test-Time Adaptation for Visual Document Understanding

06/15/2022
by Sayna Ebrahimi, et al.

Self-supervised pretraining has produced transferable representations for various visual document understanding (VDU) tasks. However, the ability of such representations to adapt to distribution shifts at test time has not yet been studied. We propose DocTTA, a novel test-time adaptation approach for documents that leverages cross-modality self-supervised learning via masked visual language modeling, together with pseudo labeling, to adapt models learned on a source domain to an unlabeled target domain at test time. We also introduce new benchmarks using existing public datasets for various VDU tasks, including entity recognition, key-value extraction, and document visual question answering, where DocTTA improves the source model performance by up to 1.79% (F1 score), 3.43% (F1 score), and 17.68% (ANLS score), respectively, while drastically reducing calibration error on target data.
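
The pseudo-labeling component mentioned above can be illustrated with a minimal sketch. The abstract does not say how DocTTA selects pseudo labels, so the confidence threshold used here, along with the names `softmax`, `select_pseudo_labels`, and the sample logits, is an assumption made for illustration only and should not be read as the authors' method.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(batch_logits, threshold=0.9):
    """Keep only predictions whose top probability reaches the threshold.

    Assumption: a fixed confidence threshold; the abstract does not
    specify the selection rule DocTTA actually uses.
    Returns a list of (sample_index, predicted_label, confidence).
    """
    selected = []
    for i, logits in enumerate(batch_logits):
        probs = softmax(logits)
        conf = max(probs)
        label = probs.index(conf)
        if conf >= threshold:
            selected.append((i, label, conf))
    return selected


# Hypothetical source-model logits for three unlabeled target samples.
logits = [
    [4.0, 0.5, 0.1],  # confident prediction for class 0
    [1.0, 1.1, 0.9],  # near-uniform, too uncertain to keep
    [0.2, 0.1, 5.0],  # confident prediction for class 2
]
picked = select_pseudo_labels(logits, threshold=0.9)
for index, label, conf in picked:
    print(f"sample {index}: pseudo label {label} (confidence {conf:.3f})")
```

In a test-time adaptation loop, the retained samples would typically be treated as labeled data for a further update of the model on the target domain, while the uncertain ones are skipped.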

research
10/10/2022

Visual Prompt Tuning for Test-time Domain Adaptation

Models should have the ability to adapt to unseen data during test-time ...
research
04/21/2022

Contrastive Test-Time Adaptation

Test-time adaptation is a special setting of unsupervised domain adaptat...
research
10/18/2022

Towards Understanding GD with Hard and Conjugate Pseudo-labels for Test-Time Adaptation

We consider a setting that a model needs to adapt to a new domain under ...
research
03/01/2023

Self-Supervised Convolutional Visual Prompts

Machine learning models often fail on out-of-distribution (OOD) samples....
research
06/07/2019

Classifying the reported ability in clinical mobility descriptions

Assessing how individuals perform different activities is key informatio...
research
08/09/2023

GeoAdapt: Self-Supervised Test-Time Adaption in LiDAR Place Recognition Using Geometric Priors

LiDAR place recognition approaches based on deep learning suffer a signi...
research
04/26/2023

Tissue Classification During Needle Insertion Using Self-Supervised Contrastive Learning and Optical Coherence Tomography

Needle positioning is essential for various medical applications such as...
