Sanskrit Sandhi Splitting using seq2(seq)^2

01/01/2018
by Neelamadhav Gantayat, et al.

In Sanskrit, small words (morphemes) are combined through a morphophonological process called Sandhi to form compound words. Sandhi splitting is the process of splitting a given compound word into its constituent morphemes. Although rules governing the splitting of words exist, it is highly challenging to identify the location of the split(s) in a compound word, as the same compound word might be broken down in multiple ways to provide syntactically correct splits. Existing systems explore incorporating these pre-defined splitting rules, but have low accuracy since they don't address the fundamental problem of identifying the split location. With this work, we propose a novel Double Decoder RNN (DD-RNN) architecture which i) predicts the location of the split(s) with an accuracy of 95% and ii) predicts the constituent words (i.e. learning the Sandhi splitting rules) with an accuracy of 79.5%. To the best of our knowledge, deep learning techniques have never been applied to the Sandhi splitting problem before. We further demonstrate that our model significantly outperforms the previous state of the art.
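
The ambiguity described above, where rules alone leave many candidate split locations, can be illustrated with a small sketch. It is not the DD-RNN model and not the authors' rule set: the `REVERSE_RULES` table, the `nfc` helper, and the `candidate_splits` function are hypothetical names introduced here, and the table covers only three well-known vowel sandhi patterns written in IAST transliteration.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so that a long vowel such as 'ā' is one code point."""
    return unicodedata.normalize("NFC", text)


# Toy reverse-sandhi table (an illustrative assumption, not the paper's rules):
# surface vowel -> possible (left-word final, right-word initial) vowel pairs.
REVERSE_RULES = {
    nfc("ā"): [("a", "a"), ("a", nfc("ā")), (nfc("ā"), "a"), (nfc("ā"), nfc("ā"))],
    "o": [("a", "u")],
    "e": [("a", "i")],
}


def candidate_splits(compound):
    """Return every (position, left, right) split the toy rules allow.

    No lexicon is consulted, so most candidates are not real words;
    that over-generation is the point of the illustration.
    """
    compound = nfc(compound)
    candidates = []
    for pos, ch in enumerate(compound):
        for left_end, right_start in REVERSE_RULES.get(ch, []):
            left = compound[:pos] + left_end
            right = right_start + compound[pos + 1:]
            candidates.append((pos, left, right))
    return candidates


splits = candidate_splits("devālaya")
for pos, left, right in splits:
    print(pos, left, "+", right)
```

For the compound devālaya, this toy table proposes five candidates at two different positions, of which only deva + ālaya is the intended split. A rule-based system has to rank such candidates somehow, which is the split-location problem the abstract identifies as fundamental.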


