Visual Acoustic Matching

02/14/2022
by Changan Chen, et al.

We introduce the visual acoustic matching task, in which an audio clip is transformed to sound like it was recorded in a target environment. Given an image of the target environment and a waveform for the source audio, the goal is to re-synthesize the audio to match the target room acoustics as suggested by its visible geometry and materials. To address this novel task, we propose a cross-modal transformer model that uses audio-visual attention to inject visual properties into the audio and generate realistic audio output. In addition, we devise a self-supervised training objective that can learn acoustic matching from in-the-wild Web videos, despite their lack of acoustically mismatched audio. We demonstrate that our approach successfully translates human speech to a variety of real-world environments depicted in images, outperforming both traditional acoustic matching and more heavily supervised baselines.
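For intuition about what "matching the target room acoustics" means, the sketch below shows the classical formulation of the problem: when a room impulse response (RIR) is known, reverberant audio can be modeled as the dry source convolved with that RIR. This is not the cross-modal transformer described above, which must infer the acoustics from an image rather than receive an RIR. The function names (`synthetic_rir`, `convolve`, `apply_room`) and the exponentially decaying noise model are illustrative assumptions, not part of the original work.

```python
import math
import random


def synthetic_rir(rt60_s, sample_rate, seed=0):
    """Toy RIR (assumption): noise with an exponential decay envelope.

    rt60_s is the time in seconds for the tail to decay by 60 dB.
    """
    rng = random.Random(seed)
    n = max(1, int(rt60_s * sample_rate))
    # A 60 dB amplitude drop is a factor of 1000 over rt60_s seconds.
    decay = math.log(1000.0) / rt60_s
    rir = [
        rng.uniform(-1.0, 1.0) * math.exp(-decay * i / sample_rate)
        for i in range(n)
    ]
    rir[0] = 1.0  # direct sound arrives first at full amplitude
    return rir


def convolve(signal, kernel):
    """Full linear convolution; output length is len(signal) + len(kernel) - 1."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue  # skip silent samples
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def apply_room(dry, rir):
    """Add the room's reverberation to dry audio, then peak-normalize."""
    wet = convolve(dry, rir)
    peak = max(abs(x) for x in wet) or 1.0  # avoid dividing by zero on silence
    return [x / peak for x in wet]


# Usage: a single click played in a toy room with a 10 ms decay time.
dry = [1.0] + [0.0] * 9
rir = synthetic_rir(rt60_s=0.01, sample_rate=1000)
wet = apply_room(dry, rir)
```

A longer `rt60_s` produces a longer tail, which is the kind of difference between environments that the task asks a model to infer from visible geometry and materials.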


research
07/27/2023

Self-Supervised Visual Acoustic Matching

Acoustic matching aims to re-synthesize an audio clip to sound as if it ...
research
06/08/2022

Few-Shot Audio-Visual Learning of Environment Acoustics

Room impulse response (RIR) functions capture how the surrounding physic...
research
08/23/2023

AdVerb: Visually Guided Audio Dereverberation

We present AdVerb, a novel audio-visual dereverberation framework that u...
research
01/20/2023

Novel-View Acoustic Synthesis

We introduce the novel-view acoustic synthesis (NVAS) task: given the si...
research
06/16/2022

SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning

We introduce SoundSpaces 2.0, a platform for on-the-fly geometry-based a...
research
05/22/2023

ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer

Text-to-speech (TTS) has undergone remarkable improvements in performance...
research
03/26/2021

Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis

Measuring the acoustic characteristics of a space is often done by captu...
