Document Similarity for Texts of Varying Lengths via Hidden Topics

03/26/2019
by   Hongyu Gong, et al.
IBM
University of Illinois at Urbana-Champaign

Measuring similarity between texts is an important task for several applications. Existing approaches to measuring document similarity are inadequate for document pairs of non-comparable lengths, such as a long document and its summary. The difficulty stems from the lexical, contextual, and abstraction gaps between a long, richly detailed document and its concise, abstract summary. In this paper, we present a document matching approach that bridges these gaps by comparing the texts in a common space of hidden topics. We evaluate the matching algorithm on two matching tasks and find that it consistently outperforms strong baselines by a wide margin. We also highlight the benefits of incorporating domain knowledge into text matching.
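
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way to compare a long and a short text in a shared topic space. It assumes each text is already a matrix of word vectors, uses a plain SVD to obtain orthonormal "topic" directions from the long document (a stand-in technique, not necessarily the authors' method), and scores the short text by how well its word vectors are reconstructed from those topics. The function names `hidden_topics` and `topic_similarity`, the parameter `k`, and the toy vectors are all hypothetical.

```python
import numpy as np

def hidden_topics(doc_vectors, k):
    """Return k orthonormal topic vectors (rows) for a long document.

    Assumption: topics are the top-k right-singular vectors of the
    (n_words, dim) word-vector matrix. This is a stand-in choice.
    """
    _, _, vt = np.linalg.svd(doc_vectors, full_matrices=False)
    return vt[:k]

def topic_similarity(doc_vectors, short_vectors, k=2):
    """Mean cosine between each short-text word vector and its
    reconstruction from the long document's topic subspace."""
    topics = hidden_topics(doc_vectors, k)        # shape (k, dim)
    recon = short_vectors @ topics.T @ topics     # project onto topic subspace
    num = np.sum(short_vectors * recon, axis=1)
    den = np.linalg.norm(short_vectors, axis=1) * np.linalg.norm(recon, axis=1)
    cos = num / np.maximum(den, 1e-12)            # guard against zero norms
    return float(cos.mean())

# Toy 4-dimensional "word vectors" (hypothetical data for illustration).
long_doc = np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [1.0, 1.0, 0.0, 0.0]])
on_topic = np.array([[1.0, 1.0, 0.0, 0.0]])   # lies inside the topic subspace
off_topic = np.array([[0.0, 0.0, 1.0, 0.0]])  # orthogonal to the topic subspace

score_on = topic_similarity(long_doc, on_topic, k=2)
score_off = topic_similarity(long_doc, off_topic, k=2)
```

In this toy setup the on-topic text scores close to 1 and the off-topic text close to 0, since the score depends only on how much of each short-text vector falls inside the long document's topic subspace, not on the two texts having similar lengths.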

04/23/2019

Wasserstein-Fisher-Rao Document Distance

As a fundamental problem of natural language processing, it is important...
10/08/2022

EDU-level Extractive Summarization with Varying Summary Lengths

Extractive models usually formulate text summarization as extracting top...
03/01/2020

StructSum: Incorporating Latent and Explicit Sentence Dependencies for Single Document Summarization

Traditional preneural approaches to single document summarization relied...
12/06/2022

Document-Level Abstractive Summarization

The task of automatic text summarization produces a concise and fluent t...
07/08/2018

Replicated Siamese LSTM in Ticketing System for Similarity Learning and Retrieval in Asymmetric Texts

The goal of our industrial ticketing system is to retrieve a relevant so...
12/07/2020

An Enhanced MeanSum Method For Generating Hotel Multi-Review Summarizations

Multi-document summarization is the process of taking multiple texts as ...
12/02/2021

KPDrop: An Approach to Improving Absent Keyphrase Generation

Keyphrase generation is the task of generating phrases (keyphrases) that...