Sound Localization by Self-Supervised Time Delay Estimation

04/26/2022
by   Ziyang Chen, et al.
Sounds reach one microphone in a stereo pair sooner than the other, resulting in an interaural time delay that conveys their directions. Estimating a sound's time delay requires finding correspondences between the signals recorded by each microphone. We propose to learn these correspondences through self-supervision, drawing on recent techniques from visual tracking. We adapt the contrastive random walk of Jabri et al. to learn a cycle-consistent representation from unlabeled stereo sounds, resulting in a model that performs on par with supervised methods on "in the wild" internet recordings. We also propose a multimodal contrastive learning model that solves a visually-guided localization task: estimating the time delay for a particular person in a multi-speaker mixture, given a visual representation of their face. Project site: https://ificl.github.io/stereocrw/
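To make the notion of a time delay concrete, here is a minimal sketch of the classical, non-learned baseline: brute-force cross-correlation between the two channels. It is not the self-supervised model proposed in the paper. The function names `estimate_delay` and `delay_signal`, the synthetic noise source, and the 16 kHz sample rate are illustrative assumptions, not details taken from the source.

```python
import math
import random


def estimate_delay(left, right, max_lag):
    """Return the integer lag d in [-max_lag, max_lag] that maximizes
    sum_i left[i] * right[i + d]. A positive d means the sound reaches
    the right channel d samples later than the left channel."""
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(left)):
            j = i + lag
            if 0 <= j < len(right):
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_signal(signal, delay):
    """Shift a signal later in time by `delay` samples, zero-padding the start."""
    n = len(signal)
    out = [0.0] * n
    for i in range(n):
        j = i - delay
        if 0 <= j < n:
            out[i] = signal[j]
    return out


# Synthetic stereo pair (assumed setup): white noise that reaches the
# right microphone 7 samples after the left one.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(2000)]
left = source
right = delay_signal(source, 7)

SAMPLE_RATE_HZ = 16000  # assumed sample rate, for converting samples to seconds
lag_samples = estimate_delay(left, right, max_lag=20)
lag_seconds = lag_samples / SAMPLE_RATE_HZ
print(lag_samples, lag_seconds)
```

In this toy setting the correlation peak is unambiguous because the source is clean white noise. Real recordings with reverberation, noise, and multiple sources make the correspondence problem much harder, which is the motivation for learning the correspondences instead.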

research
11/28/2022

Mix and Localize: Localizing Sound Sources in Mixtures

We present a method for simultaneously localizing multiple sound sources...
research
11/03/2022

MarginNCE: Robust Sound Localization with a Negative Margin

The goal of this work is to localize sound sources in visual scenes with...
research
03/20/2023

Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation

The images and sounds that we perceive undergo subtle but geometrically ...
research
10/25/2019

Self-supervised Moving Vehicle Tracking with Stereo Sound

Humans are able to localize objects in the environment using both visual...
research
01/20/2022

Learning Pixel Trajectories with Multiscale Contrastive Random Walks

A range of video modeling tasks, from optical flow to multiple object tr...
research
07/09/2018

Deep Co-Clustering for Unsupervised Audiovisual Learning

The seen birds twitter, the running cars accompany with noise, people ta...