Sound-to-Imagination: Unsupervised Crossmodal Translation Using Deep Dense Network Architecture

by Leonardo A. Fanzeres, et al.

The motivation of our research is to develop a sound-to-image (S2I) translation system that enables a human receiver to visually infer the occurrence of sound-related events. We expect the computer to 'imagine' the scene from the captured sound, generating original images that depict the sound-emitting source. Previous studies on similar topics opted for simplified approaches using data with low content diversity and/or strong supervision. In contrast, we propose to perform unsupervised S2I translation using thousands of distinct and unknown scenes, with only lightly pre-cleaned data, just enough to guarantee aural-visual semantic coherence. To that end, we employ conditional generative adversarial networks (GANs) with a deep densely connected generator. In addition, we implemented a moving-average adversarial loss to address GAN training instability. Although the specified S2I translation problem is quite challenging, we were able to generalize the translator model enough to obtain more than 14 translated from unknown sounds. Additionally, we present a solution using informativity classifiers to perform quantitative evaluation of S2I translation.
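
The abstract does not define the moving-average adversarial loss, so the following is only a minimal sketch of the general idea of smoothing a noisy per-step loss signal with a windowed average. The class name `MovingAverageLoss`, the window size, and the choice of a simple (unweighted) moving average are all assumptions made for illustration; they are not taken from the paper and may differ from the authors' formulation.

```python
from collections import deque


class MovingAverageLoss:
    """Hypothetical helper: simple moving average of recent loss values.

    This is an illustrative assumption, not the paper's loss definition.
    """

    def __init__(self, window: int = 5):
        if window < 1:
            raise ValueError("window must be at least 1")
        # A bounded deque drops the oldest value once `window` is exceeded.
        self.values = deque(maxlen=window)

    def update(self, loss: float) -> float:
        """Record a new raw loss and return the average over the window."""
        self.values.append(float(loss))
        return sum(self.values) / len(self.values)


if __name__ == "__main__":
    # Made-up, oscillating loss values standing in for adversarial training.
    raw_losses = [0.9, 0.2, 1.4, 0.3, 1.1, 0.4]
    smoother = MovingAverageLoss(window=3)
    for step, raw in enumerate(raw_losses):
        print(step, raw, round(smoother.update(raw), 3))
```

In a training loop, the smoothed value could be logged or used in place of the raw per-batch loss to damp oscillations; whether the paper applies the average to the discriminator loss, the generator loss, or both is not stated in the abstract.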






