Rule based Approach for Word Normalization by resolving Transcription Ambiguity in Transliterated Search Queries

10/16/2019
by Varsha Pathak, et al.

Matching query terms against document terms is the basic function of any best-effort Information Retrieval model, such as the Vector Space Model. In our SMS-based information system, we expect ordinary people to take part in information search. The system lets mobile users formulate queries in their own words, with their own transliteration style and spelling. To allow this flexibility, we resolve the term-level ambiguity caused by the transcription noise inherent in user query terms. We have developed a rule-based approach that selects the closest relevant standard term for each noisy term in the user query. We tested four versions of the rule-based algorithm, each with a different rule set; the rule sets include the basic Levenshtein minimum edit distance algorithm for term matching. This paper presents the experiments and results for a Marathi and Hindi literature information system. The experiments cover Marathi and Hindi literature, including songs, gazals, powadas, bharud, and other forms, written in a standard transliteration scheme such as ITRANS.
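As a rough illustration of the Levenshtein component only, the sketch below computes the minimum edit distance between two terms and picks the nearest entry from a list of standard terms. It is not the authors' rule set, which the abstract does not spell out. The function names, the tie-breaking behaviour, and the sample vocabulary are all assumptions made for this example.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def closest_standard_term(noisy: str, vocabulary: Sequence[str]) -> str:
    """Return the vocabulary term with the smallest edit distance.

    Assumption: ties go to the term listed first. Case is preserved,
    because ITRANS uses letter case to distinguish sounds.
    Raises ValueError if the vocabulary is empty.
    """
    return min(vocabulary, key=lambda term: levenshtein(noisy, term))


# Hypothetical standard terms in an ITRANS-like spelling, for illustration only.
standard_terms = ["prema", "paaNii", "chandra"]
print(closest_standard_term("prem", standard_terms))
```

A plain edit distance treats every character change as equally costly, which is presumably why it serves here as one ingredient of a larger rule set rather than the whole method.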
