The Effect of the Multi-Layer Text Summarization Model on the Efficiency and Relevancy of the Vector Space-based Information Retrieval

04/18/2020
by Ahmad Hussein Ababneh, et al.

The massive volume of text uploaded to the internet creates huge inverted indexes in information retrieval (IR) systems, which hurts their efficiency. The purpose of this research is to measure the effect of the Multi-Layer Similarity model of automatic text summarization on building an informative and condensed inverted index in IR systems. To achieve this purpose, we summarized a considerable number of documents using the Multi-Layer Similarity model and built the inverted index from the automatic summaries that the model generated. A series of experiments was conducted to test the performance in terms of efficiency and relevancy. The experiments include comparisons with three existing text summarization models: the Jaccard Coefficient model, the Vector Space Model, and the Latent Semantic Analysis model. The experiments examined three groups of queries with manual and automatic relevancy assessment. The Multi-Layer Similarity model had a clear positive effect on the efficiency of the IR system, without noticeable loss in the relevancy of the results. However, the evaluation showed that the traditional statistical models without semantic investigation failed to improve information retrieval efficiency. Compared with previous publications that addressed the use of summaries as the source of the index, the relevancy assessment of our work was higher, and the Multi-Layer Similarity model produced an inverted index that was 58% smaller than the inverted index of the main corpus.
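The idea of indexing summaries instead of full documents can be illustrated with a minimal sketch. It is not the authors' implementation: the summarizer below is a placeholder that keeps only the first sentence, standing in for the Multi-Layer Similarity model, and the function names, the toy corpus, and the posting count used as the size measure are all assumptions made for illustration.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def lead_sentence_summary(text):
    """Placeholder summarizer (assumption): keep only the first sentence.

    This stands in for a real summarization model and does not
    reproduce the Multi-Layer Similarity model from the paper.
    """
    return re.split(r"(?<=[.!?])\s+", text.strip())[0]


def index_size(index):
    """Count (term, document) postings as a simple size measure."""
    return sum(len(postings) for postings in index.values())


def size_reduction(full_index, summary_index):
    """Fraction by which the summary index is smaller than the full index."""
    return 1.0 - index_size(summary_index) / index_size(full_index)


# Toy corpus (hypothetical data, for illustration only).
corpus = {
    1: "Inverted indexes map terms to documents. They grow with every uploaded page.",
    2: "Summaries keep the key sentences. Shorter text yields fewer postings to store.",
}
summaries = {doc_id: lead_sentence_summary(text) for doc_id, text in corpus.items()}

full_index = build_inverted_index(corpus)
summary_index = build_inverted_index(summaries)
print(f"Reduction in postings: {size_reduction(full_index, summary_index):.0%}")
```

A smaller index only helps if retrieval quality holds up, which is why the experiments pair the efficiency measurement with a relevancy assessment. A query term that appears only in the discarded sentences is no longer retrievable from the summary index.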

research
08/27/2020

MultiGBS: A multi-layer graph approach to biomedical summarization

Automatic text summarization methods generate a shorter version of the i...
research
01/11/2018

Applying Vector Space Model (VSM) Techniques in Information Retrieval for Arabic Language

Information Retrieval (IR) is a part of Natural Language Processing (NLP...
research
11/19/2018

End-to-End Retrieval in Continuous Space

Most text-based information retrieval (IR) systems index objects by word...
research
09/15/2023

Encoded Summarization: Summarizing Documents into Continuous Vector Space for Legal Case Retrieval

We present our method for tackling a legal case retrieval task by introd...
research
12/17/2022

RISE: Leveraging Retrieval Techniques for Summarization Evaluation

Evaluating automatically-generated text summaries is a challenging task....
research
03/29/2021

TREC 2020 Podcasts Track Overview

The Podcast Track is new at the Text Retrieval Conference (TREC) in 2020...
research
04/22/2023

(Vector) Space is Not the Final Frontier: Product Search as Program Synthesis

As ecommerce continues growing, huge investments in ML and NLP for Infor...
