Characterisation of speech diversity using self-organising maps

01/23/2017
by Tom A. F. Anderson, et al.

We report investigations into speaker classification of large quantities of unlabelled speech data using small sets of manually phonemically annotated speech. The Kohonen speech typewriter is a semi-supervised method composed of self-organising maps (SOMs) that achieves low phoneme error rates. A SOM is a 2D array of cells that learn vector representations of the data based on neighbourhoods. In this paper, we report a method to evaluate pronunciation using multilevel SOMs with /hVd/ single-syllable utterances to study vowels in Australian pronunciation.
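
To make the description of a SOM as "a 2D array of cells that learn vector representations of the data based on neighbourhoods" concrete, here is a minimal sketch of a generic single-level SOM in plain Python. It is not the multilevel method or the Kohonen speech typewriter reported in the paper. The function names (`train_som`, `best_matching_unit`), the grid size, the linearly decaying learning rate and neighbourhood width, and the toy 2D input vectors are all assumptions chosen for illustration; real input would be acoustic feature vectors.

```python
import math
import random


def best_matching_unit(grid, x):
    """Return (row, col) of the cell whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(grid):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best_d, best = d, (r, c)
    return best


def train_som(data, rows=4, cols=4, epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small 2D SOM (hypothetical helper, illustrative defaults).

    Returns a rows x cols grid in which each cell holds a weight vector.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Each cell starts with a random weight vector in [0, 1]^dim.
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = data[:]
        rng.shuffle(samples)
        for x in samples:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood shrinks
            br, bc = best_matching_unit(grid, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood: cells near the winner on the
                    # grid move further towards the input than distant ones.
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * sigma ** 2))
                    w = grid[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
            step += 1
    return grid


if __name__ == "__main__":
    # Toy data: two well-separated clusters standing in for two sound classes.
    toy = [[0.10, 0.10], [0.15, 0.05], [0.05, 0.15],
           [0.90, 0.90], [0.85, 0.95], [0.95, 0.85]]
    som = train_som(toy)
    print(best_matching_unit(som, [0.1, 0.1]))
    print(best_matching_unit(som, [0.9, 0.9]))
```

In a semi-supervised setting like the one described above, a small annotated set could then be used to attach labels to cells, so that unlabelled inputs inherit the label of their best-matching cell; that labelling step is not shown here.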

