On the Feasibility of Automated Detection of Allusive Text Reuse

05/08/2019
by Enrique Manjavacas, et al.

The detection of allusive text reuse is particularly challenging because allusive references rely on sparse evidence: they commonly share few or no words with the text they refer to. Lexical semantics can arguably help here, since uncovering semantic relations between words has the potential to increase the support underlying the allusion and to alleviate the lexical sparsity. A further obstacle is the lack of evaluation benchmark corpora, largely due to the highly interpretative character of the annotation process. In the present paper, we aim to elucidate the feasibility of automated allusion detection. We approach the matter from an Information Retrieval perspective, in which referencing texts act as queries and referenced texts as the relevant documents to be retrieved, and we estimate the difficulty of benchmark corpus compilation through a novel inter-annotator agreement study on query segmentation. Furthermore, we investigate to what extent integrating lexical semantic information derived from distributional models and ontologies can aid the retrieval of allusive reuse. The results show that (i) despite low agreement scores, using manual queries considerably improves retrieval performance with respect to a windowing approach, and that (ii) retrieval performance can be moderately boosted with distributional semantics.
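
The retrieval framing described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the scoring function, the toy word vectors, and all names (`windows`, `soft_score`, `rank`, `TOY_VECTORS`) are assumptions made for illustration only. It shows how a query that shares no words with the relevant document scores zero under pure lexical matching, and how word-vector similarity can recover the match.

```python
import math

def tokenize(text):
    """Lowercase whitespace tokenization (a simplifying assumption)."""
    return text.lower().split()

def windows(tokens, size, step):
    """Fixed-size sliding windows, a stand-in for automatic query segmentation."""
    last_start = max(len(tokens) - size, 0)
    return [tokens[i:i + size] for i in range(0, last_start + 1, step)]

def word_similarity(a, b, vectors):
    """1.0 for identical words, cosine similarity if both have vectors, else 0.0."""
    if a == b:
        return 1.0
    if a not in vectors or b not in vectors:
        return 0.0
    u, v = vectors[a], vectors[b]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def soft_score(query, doc, vectors):
    """Mean, over query tokens, of the best similarity to any document token."""
    if not query or not doc:
        return 0.0
    best = [max(word_similarity(q, d, vectors) for d in doc) for q in query]
    return sum(best) / len(query)

def rank(query, docs, vectors):
    """Return (score, document index) pairs sorted by descending score."""
    scored = [(soft_score(query, tokenize(d), vectors), i) for i, d in enumerate(docs)]
    return sorted(scored, key=lambda pair: -pair[0])

# Hypothetical hand-made vectors; a real system would load trained embeddings.
TOY_VECTORS = {
    "lamb": [1.0, 0.0, 0.0], "sheep": [0.9, 0.1, 0.0],
    "slain": [0.0, 1.0, 0.0], "killed": [0.1, 0.9, 0.0],
    "king": [0.0, 0.0, 1.0], "temple": [0.0, 0.1, 0.9],
}
DOCS = ["the sheep was killed", "the king built a temple"]

query = tokenize("lamb slain")          # a manually segmented query
lexical = rank(query, DOCS, {})         # no vectors: exact matches only
semantic = rank(query, DOCS, TOY_VECTORS)
```

With an empty vector table both documents score zero for this query, whereas with the toy vectors the first document ranks on top. In the windowing setting, each output of `windows` over the referencing text would be used as a query in place of the manual one.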


