Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus

05/21/2023
by Detai Xin, et al.

We present a large-scale in-the-wild Japanese laughter corpus and a laughter synthesis method. Previous work on laughter synthesis lacks not only data but also a proper way to represent laughter. To address both problems, we first construct an in-the-wild corpus comprising 3.5 hours of laughter, which is, to the best of our knowledge, the largest laughter corpus designed for laughter synthesis. We then propose pseudo phonetic tokens (PPTs), which represent laughter as a sequence of discrete tokens. PPTs are obtained by training a clustering model on features extracted from laughter by a pretrained self-supervised model. Laughter can then be synthesized by feeding PPTs into a text-to-speech system. We further show that PPTs can be used to train a language model for unconditional laughter generation. Results of comprehensive subjective and objective evaluations demonstrate that the proposed method significantly outperforms a baseline method and can generate natural laughter unconditionally.
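The tokenization step described above can be illustrated with a minimal sketch. It assumes k-means as the clustering model and uses hand-written toy vectors as stand-ins for the self-supervised features; the abstract names neither the clustering algorithm nor the pretrained model, so both are assumptions, and the function names `fit_kmeans` and `to_tokens` are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, frame):
    """Index of the centroid closest to one feature frame."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(centroids[k], frame))


def fit_kmeans(frames, num_tokens, num_iters=20, seed=0):
    """Learn a codebook of `num_tokens` centroids with plain k-means.

    Assumption: k-means stands in for the unspecified clustering model.
    """
    rng = random.Random(seed)
    # Initialize centroids from randomly chosen frames.
    centroids = [list(f) for f in rng.sample(frames, num_tokens)]
    for _ in range(num_iters):
        buckets = [[] for _ in range(num_tokens)]
        for frame in frames:
            buckets[nearest(centroids, frame)].append(frame)
        for k, bucket in enumerate(buckets):
            if bucket:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids


def to_tokens(centroids, frames):
    """Map each feature frame to the id of its nearest centroid."""
    return [nearest(centroids, frame) for frame in frames]


# Toy stand-in for frame-level features of laughter clips. In practice
# these would come from a pretrained self-supervised speech model.
training_frames = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
codebook = fit_kmeans(training_frames, num_tokens=2)

# A new "utterance" becomes a sequence of discrete token ids, one per frame.
utterance = [[0.2, 0.4], [9.5, 10.5], [9.8, 10.1]]
print(to_tokens(codebook, utterance))
```

The resulting id sequence plays the role that a phoneme sequence plays for ordinary speech: it can serve as the input to a text-to-speech model, or as training data for a language model over tokens.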
