Hidden-Markov-Model Based Speech Enhancement

07/04/2017
by Daniel Dzibela, et al.

The goal of this contribution is to use a parametric speech synthesis system to reduce background noise and other interference in recorded speech signals. In a first step, the Hidden Markov Models of the synthesis system are trained. Two suitable training corpora, each consisting of text and corresponding speech files, have been set up and cleared of various faults, including inaudible utterances and incorrect assignments between audio and text data. The corpora are tested and compared against each other with regard to, for example, flaws in the synthesized speech, its naturalness, and its intelligibility. Different voices have been synthesized in this way; their quality depends less on the number of training samples used than on the cleanliness and signal-to-noise ratio of those samples. Generalized voice models have been used for synthesis, and the results differ greatly between the two speech corpora. Tests of the adaptation to different speakers show that a resemblance to the original speaker is audible throughout all recordings, yet the synthesized voices sound robotic and unnatural in some passages. The spoken text, however, is usually intelligible, which indicates that the models are working well. In a novel approach, speech is synthesized using side information from the original audio signal, in particular the pitch frequency. Results show an increase in speech quality and intelligibility compared with speech synthesized solely from text, up to the point of being nearly indistinguishable from the original.
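
The abstract names the signal-to-noise ratio of the training samples as a main factor in voice quality but does not say how it was measured. As a rough illustration only, the sketch below computes the standard power-ratio SNR in decibels, assuming that separate speech and noise sample sequences are available; the function name `snr_db` and its arguments are hypothetical and not taken from the paper.

```python
import math


def snr_db(signal, noise):
    """Return the signal-to-noise ratio in dB for two sample sequences.

    Assumption: `signal` holds speech samples and `noise` holds noise
    samples, both as non-empty sequences of floats. This is the textbook
    definition, not necessarily the measure used by the authors.
    """
    if not signal or not noise:
        raise ValueError("both sequences must be non-empty")

    # Mean power of each sequence (average of squared samples).
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)

    if p_noise == 0:
        return math.inf   # no noise at all
    if p_signal == 0:
        return -math.inf  # no signal at all

    return 10.0 * math.log10(p_signal / p_noise)
```

A noise amplitude one tenth of the signal amplitude corresponds to a power ratio of 100, which is about 20 dB.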

