Towards an automatic recognition of mixed languages: The Ukrainian-Russian hybrid language Surzhyk

12/18/2019
by Nataliya Sira, et al.

Language interference is common in today's multilingual societies, where a growing number of languages are in contact, and it can ultimately lead to the creation of hybrid languages. These languages, together with doubts about their right to official recognition, have raised in computational linguistics the problem of their automatic identification and further processing. In this paper, we propose a first attempt to identify the elements of a Ukrainian-Russian hybrid language, Surzhyk, by means of example-based rules implemented in the programming language R. Our example-based study consists of: 1) analysis of spoken samples of Surzhyk recorded by Del Gaudio (2010) in the Kyiv area and creation of a written corpus; 2) formulation of specific rules for identifying Surzhyk patterns and their implementation; 3) testing the code and analysing its effectiveness.
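To make the idea of example-based rules concrete, the following is a minimal sketch of how rule matching over a written corpus might look. It is written in Python for illustration, not in R as the paper's implementation is. The rule labels, the two toy patterns, and the helper names `flag_tokens` and `hybrid_ratio` are hypothetical. They are not taken from the paper or from Del Gaudio's (2010) samples.

```python
import re

# Hypothetical toy rules (assumptions, NOT the paper's rule set).
# Each rule maps a label to a regex applied to one lowercased token.
RULES = {
    # "шо" used where standard Ukrainian has "що"
    "sho_for_shcho": re.compile(r"^шо$"),
    # Russian "конечно" written with Ukrainian orthography
    "russian_stem_ukr_spelling": re.compile(r"^канєшно$"),
}

# A token is a run of letters, optionally joined by an apostrophe (as in "п'ять").
TOKEN_RE = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")


def tokenize(text):
    """Split text into lowercased word tokens."""
    return TOKEN_RE.findall(text.lower())


def flag_tokens(text):
    """Return (token, rule_label) pairs for tokens matched by any rule."""
    hits = []
    for token in tokenize(text):
        for label, pattern in RULES.items():
            if pattern.match(token):
                hits.append((token, label))
                break  # report at most one rule per token
    return hits


def hybrid_ratio(text):
    """Share of tokens flagged by the rules; 0.0 for empty input."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(flag_tokens(text)) / len(tokens)


if __name__ == "__main__":
    sample = "Шо ти кажеш? Канєшно, я прийду."
    print(flag_tokens(sample))
    print(hybrid_ratio(sample))
```

A real rule set would be derived from the corpus and would need to be evaluated against annotated data, which corresponds to the testing step listed in the abstract.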


