Automatic generation of semantic corpora for improving intent estimation of taxonomy-driven search engines

03/30/2022
by Lorenzo Massai, et al.

With the increasing demand for intelligent systems capable of operating in different user contexts (e.g. users on the move), the correct interpretation of the user need has become crucial for giving a consistent answer to the user query. The most effective techniques for this task come from the fields of natural language processing and semantic expansion of terms. Such systems aim to estimate the actual meaning of input queries by addressing the concepts behind the words expressed in the user questions. The aim of this paper is to demonstrate which semantic relation has the greatest impact in semantic expansion-based retrieval systems and to identify the best tradeoff between accuracy and noise introduction when combining such relations. The evaluations are made by building a simple natural language processing system capable of querying any taxonomy-driven domain, using combinations of different semantic expansions as knowledge resources. The proposed evaluation employs a wide and varied taxonomy as a use case, exploiting its labels as the basis for the expansions. To build the knowledge resources, several corpora have been produced and integrated as gazetteers into the NLP infrastructure, with the purpose of estimating the pseudo-queries corresponding to the taxonomy labels, which are considered as the possible intents.
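
As a rough illustration of the idea described above, the following minimal Python sketch expands taxonomy labels through selected semantic relations, stores the expanded terms in a gazetteer, and ranks labels (the candidate intents) by how many query tokens they match. It is not the authors' implementation: the labels, the expansion terms, the relation names, and the functions `build_gazetteer` and `estimate_intents` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: taxonomy labels (intents) with hand-written
# expansions per semantic relation. None of this comes from the paper.
EXPANSIONS = {
    "restaurant": {
        "synonym": ["eatery", "diner"],
        "hypernym": ["building", "place"],
    },
    "pharmacy": {
        "synonym": ["drugstore", "chemist"],
        "hypernym": ["shop", "place"],
    },
}


def build_gazetteer(expansions, relations):
    """Map each term to the set of taxonomy labels it can signal."""
    gazetteer = defaultdict(set)
    for label, by_relation in expansions.items():
        gazetteer[label].add(label)  # a label always matches itself
        for relation in relations:
            for term in by_relation.get(relation, []):
                gazetteer[term].add(label)
    return gazetteer


def estimate_intents(query, gazetteer):
    """Count gazetteer hits per label; return (label, hits) ranked by hits."""
    scores = defaultdict(int)
    for token in query.lower().split():
        for label in gazetteer.get(token, ()):
            scores[label] += 1
    # Highest hit count first; ties broken alphabetically by label.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


synonyms_only = build_gazetteer(EXPANSIONS, ["synonym"])
with_hypernyms = build_gazetteer(EXPANSIONS, ["synonym", "hypernym"])

print(estimate_intents("nearest drugstore open now", synonyms_only))
print(estimate_intents("a place to eat", synonyms_only))
print(estimate_intents("a place to eat", with_hypernyms))
```

In this toy setup, adding the hypernym relation lets the query "a place to eat" match a label at all, but the shared term "place" matches both labels equally. That is a small instance of the accuracy versus noise tradeoff the abstract refers to when relations are combined.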


