Plug-and-Play Document Modules for Pre-trained Models

05/28/2023
by Chaojun Xiao, et al.

Large-scale pre-trained models (PTMs) have been widely used in document-oriented NLP tasks such as question answering. However, the encoding-task coupling requirement results in the repeated encoding of the same documents for different tasks and queries, which is highly computationally inefficient. To this end, we aim to decouple document encoding from downstream tasks, and propose to represent each document as a plug-and-play document module, i.e., a document plugin, for PTMs (PlugD). By inserting document plugins into the backbone PTM for downstream tasks, we can encode a document once to handle multiple tasks, which is more efficient than conventional encoding-task coupling methods that simultaneously encode documents and input queries using task-specific encoders. Extensive experiments on 8 datasets covering 4 typical NLP tasks show that PlugD enables models to encode documents once and for all across different scenarios. In particular, PlugD can reduce computational costs by 69% while achieving performance comparable to state-of-the-art encoding-task coupling methods. Additionally, we show that PlugD can serve as an effective post-processing method for injecting knowledge into task-specific models, improving performance without any additional model training.
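To make the decoupling idea concrete, below is a minimal PyTorch sketch of the pattern the abstract describes: a document is encoded once into a reusable "plugin" representation, which different task-specific modules can then consume without re-encoding the document. The class names (DocumentPlugin, TaskHead), the tiny model sizes, and the cross-attention fusion used here are illustrative assumptions for exposition, not the authors' actual PlugD architecture.

```python
# Illustrative sketch only: encode a document once, reuse it across tasks/queries.
import torch
import torch.nn as nn

class DocumentPlugin(nn.Module):
    """Hypothetical one-time, task-agnostic document encoder."""
    def __init__(self, vocab_size: int, dim: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=2,
        )

    @torch.no_grad()
    def encode(self, doc_ids: torch.Tensor) -> torch.Tensor:
        # Computed once per document; the result can be cached and reused.
        return self.encoder(self.embed(doc_ids))

class TaskHead(nn.Module):
    """Hypothetical task-specific module that 'plugs in' a cached document."""
    def __init__(self, vocab_size: int, dim: int, num_labels: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(dim, num_labels)

    def forward(self, query_ids: torch.Tensor, plugin: torch.Tensor) -> torch.Tensor:
        q = self.embed(query_ids)
        # The query attends over the cached plugin instead of re-encoding the document.
        fused, _ = self.cross_attn(q, plugin, plugin)
        return self.classifier(fused.mean(dim=1))

# Usage: one document encoding amortized over many queries.
doc_ids = torch.randint(0, 1000, (1, 128))
plugin = DocumentPlugin(1000, 64).encode(doc_ids)   # encoded once, cacheable
qa_head = TaskHead(1000, 64, num_labels=2)
for _ in range(3):
    query_ids = torch.randint(0, 1000, (1, 16))
    logits = qa_head(query_ids, plugin)              # no document re-encoding
```

The design point this sketch captures is the cost model: document encoding, the expensive step, happens once per document rather than once per (document, task, query) triple, which is where the reported computational savings would come from.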


