Discriminating between Indo-Aryan Languages Using SVM Ensembles

07/09/2018
by Alina Maria Ciobanu, et al.

In this paper, we present a system based on SVM ensembles trained on characters and words to discriminate between five similar languages of the Indo-Aryan family: Hindi, Braj Bhasha, Awadhi, Bhojpuri, and Magahi. We investigate the performance of individual features and combine the output of single classifiers to maximize performance. The system competed in the Indo-Aryan Language Identification (ILI) shared task organized within the VarDial Evaluation Campaign 2018. Our best entry in the competition, named ILIdentification, scored 88.95
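
As a rough illustration of the two ideas the abstract names (character-level features and combining the output of single classifiers), here is a minimal Python sketch. It is not the authors' implementation: the abstract does not state the n-gram orders or the fusion rule, so the helper names `char_ngrams` and `plurality_vote`, the choice of plurality voting, and the example predictions are all assumptions made for illustration. The SVM training step itself is omitted.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return the overlapping character n-grams of `text` (hypothetical helper)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def plurality_vote(predictions):
    """Return the label predicted by the most classifiers.

    Ties go to the tied label that appears first in `predictions`.
    Plurality voting is an assumed fusion rule, not one taken from the paper.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    return next(label for label in predictions if counts[label] == best)


# Character trigrams of a toy string, the kind of feature a
# character-level classifier could be trained on.
print(char_ngrams("hindi", 3))

# Made-up outputs of three single classifiers for one sentence,
# e.g. one classifier per feature type.
single_outputs = ["Bhojpuri", "Magahi", "Bhojpuri"]
print(plurality_vote(single_outputs))
```

In a full system, each entry of `single_outputs` would come from a separate SVM trained on one feature type, and the vote would be taken per test sentence.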

