Information Retrieval from the Digitized Books

12/02/2022
by Riya Gupta, et al.

Extracting relevant information from a large number of documents is a challenging and tedious task. The results produced by traditional full-text search engines and text-based image retrieval systems are often suboptimal. Information retrieval (IR) becomes more challenging still with nontraditional language scripts, as in the case of Indic scripts. We have developed an OCR (Optical Character Recognition) Search Engine, an Information Retrieval and Extraction (IRE) system that replicates current state-of-the-art methods using IRE and Natural Language Processing (NLP) techniques. We present a study of the methods used for performing search and retrieval tasks. The details of this system, along with the statistics of the dataset (source: National Digital Library of India, or NDLI), are also presented. Additionally, we discuss ideas for further exploration that could add value to research in IRE.


