Towards Generating Ambisonics Using Audio-Visual Cue for Virtual Reality

08/16/2019
by Aakanksha Rana, et al.

Ambisonics, i.e., full-sphere surround sound, is essential alongside 360-degree visual content for a realistic virtual reality (VR) experience. While the capture of 360-degree visual content has advanced rapidly in recent years, estimating the corresponding spatial sound remains challenging because it requires either sound-field microphones or information about the sound-source locations. In this paper, we introduce the novel problem of generating Ambisonics for 360-degree videos using audio-visual cues. To this end, we first introduce a novel 360-degree audio-visual video dataset of 265 videos with annotated sound-source locations. Second, we design a pipeline for automatic Ambisonics estimation. Using deep learning-based audio-visual feature-embedding and prediction modules, our pipeline estimates the 3D sound-source locations and then uses these locations to encode the sound into B-format. To benchmark our dataset and pipeline, we additionally propose evaluation criteria and use them to compare performance across different 360-degree input representations. Our results demonstrate the efficacy of the proposed pipeline and open up a new area of research in 360-degree audio-visual analysis.
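The final step described in the abstract, encoding a located sound source into B-format, can be illustrated with a minimal sketch. It assumes first-order Ambisonics with the traditional B-format gains (W scaled by 1/sqrt(2)); the abstract does not state which normalization or channel ordering the authors use. The function names `equirect_to_angles` and `encode_b_format` are hypothetical, and the pixel-to-angle convention (image centre at azimuth 0, azimuth increasing to the left, elevation positive upwards) is likewise an assumption chosen for illustration.

```python
import math


def equirect_to_angles(u, v, width, height):
    """Map an equirectangular pixel (u, v) to (azimuth, elevation) in radians.

    Assumed convention (not taken from the paper): the image centre is
    azimuth 0 and elevation 0, azimuth grows towards the left edge, and
    elevation grows towards the top edge.
    """
    azimuth = (0.5 - u / width) * 2.0 * math.pi
    elevation = (0.5 - v / height) * math.pi
    return azimuth, elevation


def encode_b_format(samples, azimuth, elevation):
    """Encode a mono signal into first-order B-format channels W, X, Y, Z.

    Uses the traditional B-format gains, where the omnidirectional W
    channel is attenuated by 1/sqrt(2). Other conventions scale W
    differently, so treat this as one possible choice.
    """
    w_gain = 1.0 / math.sqrt(2.0)
    x_gain = math.cos(azimuth) * math.cos(elevation)  # front-back axis
    y_gain = math.sin(azimuth) * math.cos(elevation)  # left-right axis
    z_gain = math.sin(elevation)                      # up-down axis
    return {
        "W": [s * w_gain for s in samples],
        "X": [s * x_gain for s in samples],
        "Y": [s * y_gain for s in samples],
        "Z": [s * z_gain for s in samples],
    }


if __name__ == "__main__":
    # A source annotated at the centre of a 1920x1080 equirectangular frame.
    az, el = equirect_to_angles(960, 540, 1920, 1080)
    channels = encode_b_format([1.0, -0.5], az, el)
    for name, values in channels.items():
        print(name, values)
```

With several sources, the per-source channels would simply be summed sample by sample, since Ambisonic encoding is linear in the source signals.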


