An Adaptive Technique to Categorize Indic Language Documents

03/06/2022
by Dulani Meedeniya, et al.

The significant growth of electronic media for storing and exchanging text documents has led to the use of tools that analyze and categorize documents based on their content. The availability of full-text documents in electronic form emphasizes the need for intelligent information retrieval techniques. In Sri Lanka, most public services rely on text documents written in Sinhala. As a result, there is an essential need for a system that can analyze and process documents in Sinhala. The main techniques examined in this study are data pre-processing and data clustering. The approach makes use of a transformation based on text frequency, which enhances the clustering performance. This research provides an approach based on Latent Semantic Analysis to process text documents written in Sinhala, empowering citizens and organizations to do their daily work more easily.
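The abstract does not spell out the frequency-based transformation, so the following is only a minimal sketch under the assumption that it resembles standard TF-IDF weighting. The function names `tf_idf` and `cosine`, the whitespace tokenizer, and the placeholder documents are all hypothetical and are not taken from the paper. In a Latent Semantic Analysis pipeline, a weighted matrix like this one would typically be reduced with a truncated singular value decomposition before clustering; that step is omitted here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return (vocabulary, weight rows) for whitespace-tokenized documents.

    Assumption: plain TF-IDF, tf = count / document length,
    idf = log(N / document frequency). The paper may use a different variant.
    """
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        rows.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, rows


def cosine(u, v):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Placeholder documents (hypothetical); real input would be pre-processed
# Sinhala text, which is outside the scope of this sketch.
docs = [
    "tax form income tax",
    "income tax payment form",
    "school exam results",
]
vocab, weights = tf_idf(docs)
sim_related = cosine(weights[0], weights[1])
sim_unrelated = cosine(weights[0], weights[2])
```

Under these assumptions, the two tax-related documents share weighted terms and so score a higher cosine similarity with each other than with the unrelated third document, which is the property a clustering step relies on.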


research
03/06/2022

Evaluation of Partition-Based Text Clustering Techniques to Categorize Indic Language Documents

Wide availability of electronic data has led to the vast interest in tex...
research
03/06/2022

A Comparative Study on Data Representation to Categorize Text Documents

In the modern world text documents play an important role in most of the...
research
04/16/2020

An approach based on Combination of Features for automatic news retrieval

Nowadays, according to the increasingly increasing information, the impo...
research
06/04/2023

Using artificial-intelligence tools to make LaTeX content accessible to blind readers

Screen-reader software enables blind users to access large segments of e...
research
11/22/2019

SWift – A SignWriting editor to bridge between deaf world and e-learning

SWift (SignWriting improved fast transcriber) is an advanced editor for ...
research
10/05/2022

Intelligent Information Retrieval: Techniques for Character Recognition and Structured Data Extraction

The day-to-day activities of every corporation involve working with a h...
research
04/30/2021

Word-Level Alignment of Paper Documents with their Electronic Full-Text Counterparts

We describe a simple procedure for the automatic creation of word-level ...
