Penambahan emosi menggunakan metode manipulasi prosodi untuk sistem text to speech bahasa Indonesia

06/29/2016
by   Salita Ulitia Prini, et al.

Adding emotion to an Indonesian text-to-speech system using a prosody manipulation method. Text To Speech (TTS) is a system that converts text in a given language into speech, in accordance with how the text is read in that language. This research focuses on naturalness: "humanizing" the pronunciation of a Text To Speech voice synthesis system. Humans have emotions / intonation that affect the sound they produce. The system in this research is built on eSpeak, the MBROLA id1 voice database, and a Human Speech Corpus database taken from a website that lists the highest-frequency words (Most Common Words) used in a country. Three basic types of emotion / intonation are designed: happy, angry, and sad. The emotional filter is developed by manipulating the relevant prosody features (especially pitch and duration values) using predetermined rate factors, which were established by analyzing the differences between the standard Text To Speech output and voice recordings with a particular emotional prosody / intonation. The test results for the perception tests of Human Speech Corpus for happy emotion is 95 angry emotion and 98.75 by intelligibility and naturalness test. Intelligibility test for the accuracy of sound with the original sentence is 93.3 sentence is 62.8 75.6
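The pitch and duration manipulation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the prosody is available in MBROLA's `.pho` text format, where each line holds a phoneme, a duration in milliseconds, and optional (position %, pitch Hz) pairs. The function name `scale_prosody` and the factor values in the usage example are hypothetical and are not taken from the paper.

```python
def scale_prosody(pho_text, pitch_factor, duration_factor):
    """Scale pitch targets and durations in MBROLA .pho text.

    Hypothetical helper for illustration; the factors are assumptions,
    not values reported in the paper.
    """
    out_lines = []
    for line in pho_text.splitlines():
        stripped = line.strip()
        fields = stripped.split()
        # Pass through blank lines, comments (";"), and any line that
        # does not carry at least a phoneme and a duration.
        if stripped.startswith(";") or len(fields) < 2:
            out_lines.append(line)
            continue
        phoneme = fields[0]
        duration_ms = float(fields[1]) * duration_factor
        new_fields = [phoneme, str(round(duration_ms))]
        # Remaining fields are (position %, pitch Hz) pairs.
        pairs = fields[2:]
        for i in range(0, len(pairs) - 1, 2):
            position = pairs[i]
            pitch_hz = float(pairs[i + 1]) * pitch_factor
            new_fields += [position, str(round(pitch_hz))]
        out_lines.append(" ".join(new_fields))
    return "\n".join(out_lines)


if __name__ == "__main__":
    # Illustrative input: a comment, a pause, and one voiced phoneme.
    neutral = "; neutral output\n_ 200\na 100 0 120 100 130"
    # Placeholder factors: raise pitch by 50%, halve every duration.
    print(scale_prosody(neutral, pitch_factor=1.5, duration_factor=0.5))
```

In a pipeline like the one the abstract describes, the scaled text would then be passed to the MBROLA synthesizer with the id1 voice. One pair of factors per emotion (happy, angry, sad) would be chosen by comparing neutral output against emotional recordings, as the abstract describes.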


