Speech vocoding for laboratory phonology

01/22/2016
by Milos Cernak, et al.

Using phonological speech vocoding, we propose a platform for exploring relations between phonology and speech processing and, in broader terms, between the abstract and physical structures of a speech signal. Our goal is to take a step towards bridging phonology and speech processing and to contribute to the program of Laboratory Phonology. We show three application examples for laboratory phonology: compositional phonological speech modelling, a comparison of phonological systems, and an experimental phonological parametric text-to-speech (TTS) system. The featural representations of three phonological systems are considered in this work: (i) Government Phonology (GP), (ii) the Sound Pattern of English (SPE), and (iii) the extended SPE (eSPE). Comparing GP- and eSPE-based vocoded speech, we conclude that the latter achieves slightly better results than the former. However, GP, the most compact phonological speech representation, performs comparably to the systems with a higher number of phonological features. The parametric TTS based on phonological speech representation, and trained from an unlabelled audiobook in an unsupervised manner, achieves intelligibility of 85% of the state-of-the-art parametric speech synthesis. We envision that the presented approach paves the way for researchers in both fields to form meaningful hypotheses that are explicitly testable using the concepts developed and exemplified in this paper. On the one hand, laboratory phonologists might test the applied concepts of their theoretical models; on the other hand, the speech processing community may use the concepts developed for the theoretical phonological models to improve current state-of-the-art applications.

research
11/11/2020

WaDeNet: Wavelet Decomposition based CNN for Speech Processing

Existing speech processing systems consist of different modules, individ...
research
01/03/2017

Unsupervised neural and Bayesian models for zero-resource speech processing

In settings where only unlabelled speech data is available, zero-resourc...
research
08/15/2022

Towards Parametric Speech Synthesis Using Gaussian-Markov Model of Spectral Envelope and Wavelet-Based Decomposition of F0

Neural network-based Text-to-Speech has significantly improved the quali...
research
05/30/2020

Exploring Filterbank Learning for Keyword Spotting

Despite their great performance over the years, handcrafted speech featu...
research
03/01/2023

ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations

Text-to-speech (TTS) systems are modelled as mel-synthesizers followed b...
research
05/18/2022

Macedonian Speech Synthesis for Assistive Technology Applications

Speech technology is becoming ever more ubiquitous with the advance of s...
