A Topological Method for Comparing Document Semantics

12/08/2020
by   Yuqi Kong, et al.

Comparing document semantics is one of the toughest tasks in both Natural Language Processing and Information Retrieval. To date, tools for this task remain rare, and most relevant methods are devised from a statistical or vector space model perspective, with almost none taking a topological perspective. In this paper, we take a different approach and propose a novel algorithm, based on topological persistence, for comparing the semantic similarity of two documents. Our experiments are conducted on a document dataset that includes human judges' results, and a collection of state-of-the-art methods is selected for comparison. The experimental results show that our algorithm produces results highly consistent with human judgments and outperforms most state-of-the-art methods, though it ties with NLTK.

research
03/08/2021

A Topological Approach to Compare Document Semantics Based on a New Variant of Syntactic N-grams

This paper delivers a new perspective of thinking and utilizing syntacti...
research
03/13/2018

Axiomatic systems and topological semantics for intuitionistic temporal logic

We propose four axiomatic systems for intuitionistic linear temporal log...
research
05/27/2018

Legal Document Retrieval using Document Vector Embeddings and Deep Learning

Domain specific information retrieval process has been a prominent and o...
research
04/22/2019

Water-Filling: An Efficient Algorithm for Digitized Document Shadow Removal

In this paper, we propose a novel algorithm to rectify illumination of t...
research
03/29/2020

Topological Data Analysis in Text Classification: Extracting Features with Additive Information

While the strength of Topological Data Analysis has been explored in man...
research
12/03/2014

A perspective on the advancement of natural language processing tasks via topological analysis of complex networks

Comment on "Approaching human language with complex networks" by Cong an...
research
12/15/2021

Value Retrieval with Arbitrary Queries for Form-like Documents

We propose value retrieval with arbitrary queries for form-like document...
