Collaborative Learning to Generate Audio-Video Jointly

04/01/2021
by Vinod K. Kurmi, et al.

A number of techniques have demonstrated GAN-based generation of multimedia data for one modality at a time, such as images, videos, or audio. However, multi-modal generation, specifically the joint generation of audio and video, has not yet been sufficiently explored. To address this, we propose a method that generates naturalistic samples of video and audio through joint, correlated generation of the two modalities. The proposed method uses multiple discriminators to ensure that the audio, the video, and the joint output are each indistinguishable from real-world samples. We present a dataset for this task and show that we are able to generate realistic samples. The method is validated using standard metrics such as Inception Score and Fréchet Inception Distance (FID), and through human evaluation.
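The multiple-discriminator idea can be illustrated with a small sketch. The abstract does not give the loss formulation, so everything below is an assumption: a standard binary cross-entropy GAN loss, equal weights for the three discriminators, and the hypothetical names `bce`, `generator_loss`, and `discriminator_loss`. It only shows how feedback from an audio, a video, and a joint discriminator might be combined into one generator objective.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for a single probability (clamped away from 0 and 1)."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def generator_loss(d_audio, d_video, d_joint, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of non-saturating generator losses.

    Each d_* is one discriminator's probability that the generated
    sample is real. The equal default weights are an assumption,
    not a value taken from the paper.
    """
    scores = (d_audio, d_video, d_joint)
    return sum(w * bce(p, 1.0) for w, p in zip(weights, scores))


def discriminator_loss(p_real, p_fake):
    """Standard GAN loss for one discriminator: real -> 1, generated -> 0."""
    return bce(p_real, 1.0) + bce(p_fake, 0.0)


if __name__ == "__main__":
    # Hypothetical scores: the joint discriminator is the least convinced.
    print(generator_loss(d_audio=0.8, d_video=0.7, d_joint=0.3))
```

In this sketch the generator is penalized whenever any one of the three discriminators rejects its output, so realistic audio and realistic video are not sufficient if the pair does not look correlated to the joint discriminator.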

