Differentiable Time-Frequency Scattering in Kymatio

04/18/2022
by John Muradeli, et al.

Joint time-frequency scattering (JTFS) is a convolutional operator in the time-frequency domain which extracts spectrotemporal modulations at various rates and scales. It offers an idealized model of spectrotemporal receptive fields (STRF) in the primary auditory cortex, and thus may serve as a biologically plausible surrogate for human perceptual judgments at the scale of isolated audio events. Yet, prior implementations of JTFS and STRF have remained outside of the standard toolkit of perceptual similarity measures and evaluation methods for audio generation. We trace this issue to three limitations: differentiability, speed, and flexibility. In this paper, we present an implementation of time-frequency scattering in Kymatio, an open-source Python package for scattering transforms. Unlike prior implementations, Kymatio accommodates NumPy and PyTorch as backends and is thus portable to both CPU and GPU. We demonstrate the usefulness of JTFS in Kymatio via three applications: unsupervised manifold learning of spectrotemporal modulations, supervised classification of musical instruments, and texture resynthesis of bioacoustic sounds.
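To make the phrase "convolutional operator in the time-frequency domain" concrete, here is a minimal NumPy sketch of a single spectrotemporal modulation filter. It is not Kymatio's API or implementation: it assumes a plain STFT magnitude in place of a wavelet scalogram, uses one Gaussian-shaped 2D filter instead of a full wavelet filterbank, and omits averaging and the other scattering orders. The function names (`spectrogram`, `gabor_2d`, `modulation`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def spectrogram(x, win_len=256, hop=64):
    """Magnitude STFT with a Hann window (rows: frequency bins, columns: frames)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * window for i in range(n_frames)], axis=1
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def gabor_2d(shape, rate, scale, sigma=0.03):
    """2D band-pass filter defined in the Fourier domain: a Gaussian bump
    centered at `scale` (cycles per frequency bin) and `rate` (cycles per frame)."""
    f_freq = np.fft.fftfreq(shape[0])[:, None]
    f_time = np.fft.fftfreq(shape[1])[None, :]
    return np.exp(-((f_freq - scale) ** 2 + (f_time - rate) ** 2) / (2 * sigma ** 2))

def modulation(S, rate, scale):
    """Modulus of the circular 2D convolution of S with one filter, computed via FFT."""
    return np.abs(np.fft.ifft2(np.fft.fft2(S) * gabor_2d(S.shape, rate, scale)))

# Test signal (an assumption for this demo): a linear chirp rising from 500 Hz
# at 2500 Hz per second, sampled at 8 kHz for one second.
sr = 8000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * (500 * t + 0.5 * 2500 * t ** 2))
S = spectrogram(x)

# Slope of the chirp's ridge in the spectrogram, in frequency bins per frame.
slope = 2500 * (64 / sr) / (sr / 256)

# A ridge of slope `slope` concentrates its 2D Fourier energy along
# rate = -slope * scale, so this filter is matched to the chirp...
scale = 0.1
rate = -slope * scale
matched = np.mean(modulation(S, rate, scale) ** 2)
# ...whereas the filter with the opposite rate sign is tuned to a falling chirp.
mirrored = np.mean(modulation(S, -rate, scale) ** 2)

print(f"matched energy: {matched:.3g}, mirrored energy: {mirrored:.3g}")
```

The filter whose orientation follows the chirp responds much more strongly than its mirror image, which is the sense in which such filters separate spectrotemporal modulations by rate, scale, and direction. Because every step is a composition of FFTs, pointwise products, and moduli, the same computation ports naturally to an automatic differentiation backend such as PyTorch.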

research
10/10/2018

On Time-frequency Scattering and Computer Music

To appear as the preface to: "Florian Hecker: Halluzination, Perspektive...
research
01/24/2023

Mesostructures: Beyond Spectrogram Loss in Differentiable Time-Frequency Analysis

Computer musicians refer to mesostructures as the intermediate levels of...
research
07/21/2020

Time-Frequency Scattering Accurately Models Auditory Similarities Between Instrumental Playing Techniques

Instrumental playing techniques such as vibratos, glissandos, and trills...
research
06/21/2019

The Shape of RemiXXXes to Come: Audio Texture Synthesis with Time-frequency Scattering

This article explains how to apply time-frequency scattering, a convolut...
research
10/08/2021

Joint Scattering for Automatic Chick Call Recognition

Animal vocalisations contain important information about health, emotion...
research
11/02/2022

SpectroMap: Peak detection algorithm for audio fingerprinting

We present SpectroMap, an open source GitHub repository for audio finger...
research
07/20/2020

wav2shape: Hearing the Shape of a Drum Machine

Disentangling and recovering physical attributes, such as shape and mate...
