
Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity

by Sheshera Mysore, et al.

We present Aspire, a new scientific document similarity model based on matching fine-grained aspects. Our model is trained using co-citation contexts that describe related paper aspects as a novel form of textual supervision. We use multi-vector document representations, recently explored in settings with short query texts but under-explored in the challenging document-document setting. We present a fast method that involves matching only single sentence pairs, and a method that makes sparse multiple matches with optimal transport. Our model improves performance on document similarity tasks across four datasets. Moreover, our fast single-match method achieves competitive results, opening up the possibility of applying fine-grained document similarity models to large-scale scientific corpora.
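The two matching strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each document is already encoded as a matrix of sentence vectors, that sentence pairs are compared by Euclidean distance, and that sentences carry uniform weight. The entropy-regularized Sinkhorn iteration stands in for whatever optimal transport solver the paper uses, and the names pairwise_l2, single_match_distance, sinkhorn_distance, reg and n_iters are hypothetical.

```python
import numpy as np

def pairwise_l2(a, b):
    """Euclidean distances between rows of a (m, d) and rows of b (n, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def single_match_distance(a, b):
    """Fast variant: score a document pair by its single closest sentence pair."""
    return float(pairwise_l2(a, b).min())

def sinkhorn_distance(a, b, reg=0.1, n_iters=200):
    """Multi-match variant: entropy-regularized optimal transport cost.

    Assumption: uniform mass on every sentence; Sinkhorn is a stand-in
    solver, not necessarily the one used in the paper.
    """
    cost = pairwise_l2(a, b)
    m, n = cost.shape
    r = np.full(m, 1.0 / m)          # mass per sentence in document a
    c = np.full(n, 1.0 / n)          # mass per sentence in document b
    kernel = np.exp(-cost / reg)     # Gibbs kernel of the cost matrix
    u = np.ones(m)
    for _ in range(n_iters):
        v = c / (kernel.T @ u)       # rescale to match column marginals
        u = r / (kernel @ v)         # rescale to match row marginals
    plan = u[:, None] * kernel * v[None, :]
    return float((plan * cost).sum()), plan

# Toy "documents": 2 and 3 sentence vectors in 2 dimensions.
doc_a = np.array([[0.0, 0.0], [1.0, 0.0]])
doc_b = np.array([[0.0, 0.1], [1.0, 0.1], [0.5, 1.0]])

fast_score = single_match_distance(doc_a, doc_b)
ot_score, transport_plan = sinkhorn_distance(doc_a, doc_b)
```

The single-match score needs only the minimum of the distance matrix, which is what makes it cheap enough to consider for large corpora. The transport plan spreads each document's mass over several sentence pairs, so its cost can never fall below the single-match distance.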




Document Layout Analysis with Aesthetic-Guided Image Augmentation

Document layout analysis (DLA) plays an important role in information ex...

arXivEdits: Understanding the Human Revision Process in Scientific Writing

Scientific publications are the primary means to communicate research di...

Semantic Similarity Computing Model Based on Multi Model Fine-Grained Nonlinear Fusion

Natural language processing (NLP) task has achieved excellent performanc...

Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank

Detecting fine-grained differences in content conveyed in different lang...

UnScientify: Detecting Scientific Uncertainty in Scholarly Full Text

This demo paper presents UnScientify, an interactive system designed to ...

An Unsupervised Sampling Approach for Image-Sentence Matching Using Document-Level Structural Information

In this paper, we focus on the problem of unsupervised image-sentence ma...

Scaling Creative Inspiration with Fine-Grained Functional Facets of Product Ideas

Web-scale repositories of products, patents and scientific papers offer ...