Kannada Spell Checker with Sandhi Splitter

11/25/2016
by A N Akshatha, et al.

Spelling errors are introduced into text either during typing or when the user does not know the correct phoneme or grapheme. When a language contains complex words formed by sandhi, where two or more morphemes join according to rules, spell checking becomes very tedious. In such situations, a spell checker with a sandhi splitter that alerts the user by flagging errors and providing suggestions is very useful. This paper proposes a novel sandhi splitting algorithm. The sandhi splitter can split about 7000 of the most common sandhi words in the Kannada language, which were used as test samples. The sandhi splitter was integrated with a Kannada spell checker, and a mechanism for generating suggestions was added. A comprehensive, platform-independent, standalone spell checker application with a sandhi splitter was thus developed and tested extensively for efficiency and correctness. A comparative analysis showed that the Kannada spell checker with sandhi splitter has improved performance. It is twice as fast, 200 times more space efficient, and it is 90% [...] of complex nouns and 50% [...]. [...] sandhi splitter will be of foremost significance in machine translation systems, voice processing, etc. This is the first sandhi splitter for Kannada, and the advantage of the novel algorithm is that it can be extended to all Indian languages.
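
The abstract does not describe the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a tiny hypothetical lexicon in Latin transliteration and two simplified reverse-sandhi rules: restoring an elided final vowel (as in uru + alli = uralli) and removing an inserted glide (as in mane + alli = maneyalli). The names `LEXICON`, `split_sandhi`, and `check_word` are illustrative, and suggestions come from Python's standard `difflib` rather than from the mechanism used in the paper.

```python
import difflib

# Hypothetical toy lexicon (Latin transliteration), for illustration only.
LEXICON = {"mane", "uru", "alli", "mara"}

VOWELS = "aeiou"   # vowels that may have been elided at the join
GLIDES = "yv"      # glides that may have been inserted at the join


def split_sandhi(word, lexicon):
    """Return every (left, right) pair of lexicon words that could form `word`."""
    splits = []
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        # Plain concatenation, with no sound change at the boundary.
        candidates = [(left, right)]
        # Elision: the left word lost its final vowel when joining.
        candidates += [(left + v, right) for v in VOWELS]
        # Insertion: a glide was added between the two words.
        if right[0] in GLIDES:
            candidates.append((left, right[1:]))
        for cand in candidates:
            if cand[0] in lexicon and cand[1] in lexicon and cand not in splits:
                splits.append(cand)
    return splits


def check_word(word, lexicon):
    """Return (is_correct, suggestions) for a single word."""
    # A word is accepted if it is in the lexicon or splits into lexicon words.
    if word in lexicon or split_sandhi(word, lexicon):
        return True, []
    # Otherwise flag it and offer the closest lexicon entries as suggestions.
    return False, difflib.get_close_matches(word, sorted(lexicon), n=3)


print(split_sandhi("uralli", LEXICON))
print(split_sandhi("maneyalli", LEXICON))
print(check_word("manee", LEXICON))
```

Trying the splitter only after the direct lexicon lookup fails keeps the common case cheap, and it means compound forms do not need to be stored in the lexicon one by one.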
