Native Language Identification on Text and Speech

07/22/2017
by Marcos Zampieri, et al.

This paper presents an ensemble system that combines the output of multiple SVM classifiers for native language identification (NLI). The system was submitted to the fusion track of the NLI Shared Task 2017, which featured student essays and spoken responses, the latter provided as audio transcriptions and iVectors, from non-native English speakers of eleven native languages. Our system competed in the challenge under the team name ZCD and was based on an ensemble of SVM classifiers trained on character n-grams, achieving a score of 83.58 and ranking 3rd in the shared task.
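
As a rough illustration of two ingredients named in the abstract, the sketch below counts character n-grams in a text and fuses the labels produced by several classifiers. It is a minimal sketch under stated assumptions, not the authors' system: the abstract does not say how the classifier outputs are combined, so plurality voting is assumed here, the SVM training step is omitted entirely, and the function names, the sample sentence, and the labels "GER" and "SPA" are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count the overlapping character n-grams of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def plurality_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


# Character trigram counts for a short (hypothetical) learner text.
features = char_ngrams("the the", 3)

# Hypothetical labels predicted for one document by three classifiers,
# e.g. one classifier per n-gram order. Assumption: plurality voting;
# the abstract does not state the fusion rule that was actually used.
votes = ["GER", "SPA", "GER"]
predicted = plurality_vote(votes)
```

In a full pipeline, counts such as `features` would be turned into vectors and passed to an SVM, and each trained classifier would contribute one entry to `votes` per test document.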
