Multilevel Text Alignment with Cross-Document Attention

10/03/2020
by Xuhui Zhou, et al.

Text alignment finds application in tasks such as citation recommendation and plagiarism detection. Existing alignment methods operate at a single, predefined level and cannot learn to align texts at, for example, both the sentence and the document level. We propose a new learning approach that equips previously established hierarchical attention encoders for representing documents with a cross-document attention component, enabling structural comparisons across different levels (document-to-document and sentence-to-document). Our component is weakly supervised from document pairs and can align at multiple levels. In evaluations on predicting document-to-document and sentence-to-document relationships for citation recommendation and plagiarism detection, our approach outperforms previously established hierarchical attention encoders, based on recurrent and transformer contextualization, that are unaware of structural correspondence between documents.
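The core idea of the cross-document attention component can be illustrated with a minimal sketch: each sentence vector of one document attends over the sentence vectors of the other, yielding alignment weights and aligned context vectors. This is an assumption-laden simplification (function name and dot-product scoring are illustrative; the paper's model learns these comparisons jointly with hierarchical encoders):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_document_attention(sents_a, sents_b):
    """Hypothetical sketch: align document A's sentences to document B.

    sents_a: (n_a, d) sentence embeddings of document A
    sents_b: (n_b, d) sentence embeddings of document B
    Returns aligned context vectors for each sentence of A and the
    (n_a, n_b) attention matrix, which can be read as soft
    sentence-to-document alignment scores.
    """
    scores = sents_a @ sents_b.T           # (n_a, n_b) dot-product similarities
    weights = softmax(scores, axis=-1)     # each row sums to 1 over B's sentences
    context = weights @ sents_b            # (n_a, d) B-side context per A sentence
    return context, weights
```

A document-to-document score could then be obtained by pooling the attention matrix (e.g. max over B's sentences, mean over A's), while the rows of the matrix give sentence-to-document alignments, matching the multilevel setting described above.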

Related research:

- Cross-Document Language Modeling (01/02/2021)
- SPECTER: Document-level Representation Learning using Citation-informed Transformers (04/15/2020)
- Parallel Hierarchical Transformer with Attention Alignment for Abstractive Multi-Document Summarization (08/16/2022)
- Exploiting Sentence Order in Document Alignment (04/30/2020)
- Big Bidirectional Insertion Representations for Documents (10/29/2019)
- Towards Structure-aware Paraphrase Identification with Phrase Alignment Using Sentence Encoders (10/11/2022)
- Local Citation Recommendation with Hierarchical-Attention Text Encoder and SciBERT-based Reranking (12/02/2021)
