North Sámi Dialect Identification with Self-supervised Speech Models

05/19/2023
by   Sofoklis Kakouros, et al.
0

The North Sámi (NS) language encapsulates four primary dialectal variants that are related but that also have differences in their phonology, morphology, and vocabulary. The unique geopolitical location of NS speakers means that in many cases they are bilingual in Sámi as well as in the dominant state language: Norwegian, Swedish, or Finnish. This enables us to study the NS variants both with respect to the spoken state language and their acoustic characteristics. In this paper, we investigate an extensive set of acoustic features, including MFCCs and prosodic features, as well as state-of-the-art self-supervised representations, namely, XLS-R, WavLM, and HuBERT, for the automatic detection of the four NS variants. In addition, we examine how the majority state language is reflected in the dialects. Our results show that NS dialects are influenced by the state language and that the four dialects are separable, reaching high classification accuracy, especially with the XLS-R model.

READ FULL TEXT
research
11/09/2022

Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models

In this paper, we extend previous self-supervised approaches for languag...
research
05/27/2022

Self-supervised models of audio effectively explain human cortical responses to speech

Self-supervised language models are very effective at predicting high-le...
research
11/16/2020

End-to-end spoken language understanding using transformer networks and self-supervised pre-trained features

Transformer networks and self-supervised pre-training have consistently ...
research
09/30/2022

On The Robustness of Self-Supervised Representations for Spoken Language Modeling

Self-supervised representations have been extensively studied for discri...
research
10/28/2022

Analyzing Acoustic Word Embeddings from Pre-trained Self-supervised Speech Models

Given the strong results of self-supervised models on various tasks, the...
research
06/02/2023

BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models

Self-supervised techniques for learning speech representations have been...
research
05/31/2022

Do self-supervised speech models develop human-like perception biases?

Self-supervised models for speech processing form representational space...

Please sign up or login with your details

Forgot password? Click here to reset