BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network

09/06/2023
by Takashi Shibuya, et al.

Generative adversarial network (GAN)-based vocoders have been intensively studied because they can synthesize high-fidelity audio waveforms faster than real time. However, it has been reported that most GANs fail to obtain the optimal projection for discriminating between real and fake data in the feature space. In the literature, it has been demonstrated that slicing adversarial network (SAN), an improved GAN training framework that can find the optimal projection, is effective in the image generation task. In this paper, we investigate the effectiveness of SAN in the vocoding task. For this purpose, we propose a scheme to modify the least-squares GAN, which most GAN-based vocoders adopt, so that its loss functions satisfy the requirements of SAN. Through our experiments, we demonstrate that SAN can improve the performance of GAN-based vocoders, including BigVGAN, with small modifications. Our code is available at https://github.com/sony/bigvsan.
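
For orientation, the following is a minimal sketch of the standard least-squares GAN objectives that the abstract takes as its starting point. It shows only the unmodified baseline in its common form, not the SAN-compliant modification proposed in the paper. The function names and the use of plain Python lists of discriminator scores are assumptions made for illustration and are not taken from the authors' code.

```python
# Baseline least-squares GAN losses (illustrative sketch only).
# This is NOT the SAN modification described in the paper.

def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def lsgan_discriminator_loss(real_scores, fake_scores):
    """Push scores on real audio toward 1 and on generated audio toward 0."""
    real_term = mean([(s - 1.0) ** 2 for s in real_scores])
    fake_term = mean([s ** 2 for s in fake_scores])
    return real_term + fake_term


def lsgan_generator_loss(fake_scores):
    """Push the discriminator's scores on generated audio toward 1."""
    return mean([(s - 1.0) ** 2 for s in fake_scores])


if __name__ == "__main__":
    # Hypothetical discriminator outputs for a tiny batch.
    real = [0.9, 1.1]
    fake = [0.2, -0.1]
    print("discriminator loss:", lsgan_discriminator_loss(real, fake))
    print("generator loss:", lsgan_generator_loss(fake))
```

In practice these quantities would be computed on tensors inside a training loop. The list-based form here is chosen only to keep the sketch free of dependencies.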

