Advancing Full-Text Search Lemmatization Techniques with Paradigm Retrieval from OpenCorpora

05/18/2023
by Dmitriy Kalugin-Balashov, et al.

In this paper, we present a method for improving lemmatization in full-text search, using the OpenCorpora dataset and a custom paradigm retrieval algorithm. Our primary aim is to streamline the extraction of a word's primary form, or lemma, which is a crucial step in full-text search. We also propose a compact dictionary storage strategy that significantly improves the speed and precision of lemma retrieval.
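
To make the idea of paradigm-based lemma lookup concrete, here is a minimal sketch. It is not the algorithm or storage format from the paper, which the abstract does not detail. It assumes a paradigm is a tuple of endings whose first entry is the lemma ending, and that each stem stores only a paradigm id, so shared endings are kept once. The names `PARADIGMS`, `STEMS`, and `lemmatize` are hypothetical, and the word data is a hand-made toy sample, not taken from OpenCorpora.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy data (hand-made, NOT extracted from OpenCorpora).
# Each paradigm is a tuple of endings; index 0 is the ending of the lemma.
PARADIGMS: List[Tuple[str, ...]] = [
    ("а", "ы", "е", "у", "ой"),  # paradigm 0, e.g. "лампа"
    ("", "а", "у", "ом", "е"),   # paradigm 1, e.g. "стол"
]

# Each stem stores only a paradigm id, so endings shared by many
# words are kept once. This is an assumed layout for illustration.
STEMS: Dict[str, int] = {
    "ламп": 0,
    "стол": 1,
}


def lemmatize(word: str) -> Optional[str]:
    """Return the lemma of `word`, or None if no stem/ending split matches."""
    word = word.lower()
    # Try every split into stem + ending, longest stem first.
    for cut in range(len(word), 0, -1):
        stem, ending = word[:cut], word[cut:]
        paradigm_id = STEMS.get(stem)
        if paradigm_id is not None and ending in PARADIGMS[paradigm_id]:
            # Rebuild the lemma from the stem and the paradigm's lemma ending.
            return stem + PARADIGMS[paradigm_id][0]
    return None
```

With this toy data, `lemmatize("лампой")` splits the word into the stem `ламп` and the ending `ой`, finds the ending in paradigm 0, and rebuilds the lemma from that paradigm's first ending. A word with no known stem returns `None`.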


