Contrastive Document Representation Learning with Graph Attention Networks

10/20/2021
by Peng Xu, et al.

Recent progress in pretrained Transformer-based language models has shown great success in learning contextual representations of text. However, due to the quadratic complexity of self-attention, most pretrained Transformer models can only handle relatively short text, and modeling very long documents remains a challenge. In this work, we propose to use a graph attention network on top of an available pretrained Transformer model to learn document embeddings. This graph attention network allows us to leverage the high-level semantic structure of the document. In addition, based on our graph document model, we design a simple contrastive learning strategy to pretrain our models on a large unlabeled corpus. Empirically, we demonstrate the effectiveness of our approach on document classification and document retrieval tasks.
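To make the idea concrete, here is a minimal sketch of the two components the abstract describes: sentence-level encoding with a pretrained Transformer, aggregation with a graph attention layer, and an InfoNCE-style contrastive loss. The specific design choices below (sentences as graph nodes, a fully connected sentence graph, a single attention head, mean pooling, the `info_nce` temperature) are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer


class GATLayer(nn.Module):
    """Single-head graph attention layer in the style of Velickovic et al. (2018)."""

    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.attn = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, h, adj):
        # h: (num_nodes, dim); adj: (num_nodes, num_nodes) binary mask
        z = self.proj(h)
        n = z.size(0)
        # Pairwise attention logits e_ij = LeakyReLU(a^T [z_i ; z_j])
        zi = z.unsqueeze(1).expand(n, n, -1)
        zj = z.unsqueeze(0).expand(n, n, -1)
        e = F.leaky_relu(self.attn(torch.cat([zi, zj], dim=-1))).squeeze(-1)
        e = e.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(e, dim=-1)          # attention over neighbors j
        return F.elu(alpha @ z)                   # h'_i = ELU(sum_j alpha_ij z_j)


class GraphDocEncoder(nn.Module):
    """Encode each sentence with a pretrained Transformer, aggregate with GAT."""

    def __init__(self, backbone="bert-base-uncased"):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(backbone)
        self.encoder = AutoModel.from_pretrained(backbone)
        self.gat = GATLayer(self.encoder.config.hidden_size)

    def forward(self, sentences):
        batch = self.tokenizer(sentences, padding=True, truncation=True,
                               return_tensors="pt")
        # One node per sentence: take the [CLS] vector of each sentence
        h = self.encoder(**batch).last_hidden_state[:, 0]
        # Assumed graph structure: fully connected over all sentence nodes
        adj = torch.ones(len(sentences), len(sentences))
        h = self.gat(h, adj)
        return h.mean(dim=0)                      # pooled document embedding


def info_nce(view_a, view_b, temperature=0.05):
    """Contrastive loss: embeddings of two views of the same document attract,
    other documents in the batch act as negatives. view_a, view_b: (batch, dim)."""
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0))
    return F.cross_entropy(logits, targets)
```

In a pretraining loop, two corrupted views of each document (e.g. via sentence dropout) would be encoded with `GraphDocEncoder`, stacked into batches, and pushed together by `info_nce`; the paper's actual augmentations and graph construction may differ from this sketch.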

Related Research

04/15/2020 · Document-level Representation Learning using Citation-informed Transformers
Representation learning is a critical ingredient for natural language pr...

09/15/2020 · Cascaded Semantic and Positional Self-Attention Network for Document Classification
Transformers have shown great success in learning representations for la...

02/22/2022 · Socialformer: Social Network Inspired Long Document Modeling for Document Ranking
Utilizing pre-trained language models has achieved great success for neu...

05/26/2023 · Three Towers: Flexible Contrastive Learning with Pretrained Image Models
We introduce Three Towers (3T), a flexible method to improve the contras...

02/18/2022 · Modelling the semantics of text in complex document layouts using graph transformer networks
Representing structured text from complex documents typically calls for ...

05/23/2022 · Contrastive Representation Learning for Cross-Document Coreference Resolution of Events and Entities
Identifying related entities and events within and across documents is f...

10/24/2020 · ReadOnce Transformers: Reusable Representations of Text for Transformers
While large-scale language models are extremely effective when directly ...
