Increase Apparent Public Speaking Fluency By Speech Augmentation

12/09/2018
by Sagnik Das, et al.

Fluent, confident speech is something every speaker wants, but delivering speech professionally requires a great deal of experience and practice. In this paper, we propose a speech stream manipulation system that helps non-professional speakers produce fluent, professional-sounding speech, which in turn contributes to better listener engagement and comprehension. We achieve this by manipulating the disfluencies in human speech: filled pauses such as 'uh' and 'um', filler words, and awkwardly long silences. Given any unrehearsed speech, we segment and silence the filled pauses, then adjust the duration of the imposed silences, as well as other long ('disfluent') pauses, using a predictive model learned from a professional speech dataset. Finally, we output an audio stream in which the speaker sounds more fluent, confident, and practiced than in the original recording. According to our quantitative evaluation, we significantly increase the fluency of speech by reducing the rate of pauses and fillers.
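The two steps the abstract describes, silencing filled pauses and then adjusting pause durations, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the filler spans have already been located by some detector (the hypothetical `filler_spans` argument), finds silences with a simple frame-energy threshold, and uses a fixed cap (`max_pause_s`) as a stand-in for the learned predictive model of pause duration. All function names, parameters, and default values here are illustrative assumptions.

```python
import numpy as np

def find_silences(signal, sr, threshold=0.01, frame_ms=10):
    """Return (start, end) sample ranges of consecutive low-energy frames."""
    frame = int(sr * frame_ms / 1000)
    n_frames = len(signal) // frame
    runs, start = [], None
    for i in range(n_frames):
        chunk = signal[i * frame:(i + 1) * frame]
        quiet = np.sqrt(np.mean(chunk ** 2)) < threshold  # frame RMS energy
        if quiet and start is None:
            start = i * frame
        elif not quiet and start is not None:
            runs.append((start, i * frame))
            start = None
    if start is not None:
        runs.append((start, n_frames * frame))
    return runs

def smooth_speech(signal, sr, filler_spans, max_pause_s=0.5):
    """Silence the given filler spans, then trim every pause to max_pause_s.

    filler_spans: list of (start, end) sample indices, assumed to come from
    an external filled-pause detector (hypothetical input).
    max_pause_s: fixed cap used here in place of a learned duration model.
    """
    out = signal.astype(float).copy()
    for s, e in filler_spans:
        out[s:e] = 0.0  # step 1: replace the filled pause with silence
    max_len = int(max_pause_s * sr)
    pieces, cursor = [], 0
    for s, e in find_silences(out, sr):
        pieces.append(out[cursor:s])               # speech before the pause
        pieces.append(out[s:min(e, s + max_len)])  # step 2: shortened pause
        cursor = e
    pieces.append(out[cursor:])                    # remaining speech
    return np.concatenate(pieces)
```

For example, calling `smooth_speech(audio, 16000, [(8000, 11200)])` would silence the samples in that span and then shorten any resulting pause longer than half a second. A learned model would instead predict a target duration for each pause from its context.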

research
06/27/2022

Speak Like a Professional: Increasing Speech Intelligibility by Mimicking Professional Announcer Voice with Voice Conversion

In most of practical scenarios, the announcement system must deliver spe...
research
10/22/2020

How Similar or Different Is Rakugo Speech Synthesizer to Professional Performers?

We have been working on speech synthesis for rakugo (a traditional Japan...
research
10/15/2021

Neural Dubber: Dubbing for Videos According to Scripts

Dubbing is a post-production process of re-recording actors' dialogues, ...
research
03/10/2022

KSoF: The Kassel State of Fluency Dataset – A Therapy Centered Dataset of Stuttering

Stuttering is a complex speech disorder that negatively affects an indiv...
research
06/08/2021

Speech BERT Embedding For Improving Prosody in Neural TTS

This paper presents a speech BERT model to extract embedded prosody info...
research
09/01/2023

The FruitShell French synthesis system at the Blizzard 2023 Challenge

This paper presents a French text-to-speech synthesis system for the Bli...
