End-to-End Probabilistic Inference for Nonstationary Audio Analysis

01/31/2019
by William J. Wilkinson, et al.

A typical audio signal processing pipeline consists of multiple disjoint analysis stages, such as the calculation of a time-frequency representation followed by spectrogram-based feature analysis. We show how time-frequency analysis and nonnegative matrix factorisation can be jointly formulated as a spectral mixture Gaussian process model with nonstationary priors over the amplitude variance parameters. Further, we formulate this nonlinear model's state space representation, making it amenable to infinite-horizon Gaussian process regression with approximate inference via expectation propagation, which scales linearly in the number of time steps and quadratically in the state dimensionality. This allows us to process audio signals with hundreds of thousands of data points. We demonstrate, on various tasks with empirical data, how this inference scheme outperforms more standard techniques that rely on extended Kalman filtering.
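To make the model structure described above more concrete, the following is a minimal generative sketch, not the authors' implementation. It assumes one plausible reading of the abstract: each frequency channel is a quasi-periodic Gaussian process written in state space form, and each channel's amplitude is a nonnegative mixture (NMF-style weights `W`) of slowly varying positive envelopes. The exponential-times-cosine kernel, the softplus positivity mapping, all hyperparameter values, and every function name are assumptions chosen for illustration. The sketch only samples from such a model; it does not perform expectation propagation or infinite-horizon inference.

```python
import numpy as np

def softplus(x):
    # Assumed positivity mapping: turns real-valued draws into positive amplitudes.
    return np.logaddexp(0.0, x)

def simulate_subband(freq_hz, lengthscale_s, variance, fs, T, rng):
    """Sample a quasi-periodic GP (exponential x cosine kernel) through its
    2-D state space form: a rotation damped by exp(-dt / lengthscale)."""
    dt = 1.0 / fs
    w = 2.0 * np.pi * freq_hz * dt
    decay = np.exp(-dt / lengthscale_s)
    A = decay * np.array([[np.cos(w), -np.sin(w)],
                          [np.sin(w),  np.cos(w)]])
    q = variance * (1.0 - decay ** 2)  # process noise that keeps the variance stationary
    x = rng.normal(scale=np.sqrt(variance), size=2)
    out = np.empty(T)
    for k in range(T):  # one constant-cost update per time step: linear in T
        out[k] = x[0]
        x = A @ x + rng.normal(scale=np.sqrt(q), size=2)
    return out

def simulate_envelope(lengthscale_s, fs, T, rng):
    """Slowly varying latent function (Ornstein-Uhlenbeck process, unit variance)."""
    decay = np.exp(-1.0 / (fs * lengthscale_s))
    g = np.empty(T)
    g[0] = rng.normal()
    for k in range(1, T):
        g[k] = decay * g[k - 1] + rng.normal(scale=np.sqrt(1.0 - decay ** 2))
    return g

def sample_signal(freqs_hz, W, fs=8000, T=4000, seed=0):
    """Draw a signal as a sum of subbands scaled by NMF-style positive amplitudes."""
    rng = np.random.default_rng(seed)
    n_envelopes = W.shape[1]
    subbands = np.stack([simulate_subband(f, 0.02, 1.0, fs, T, rng)
                         for f in freqs_hz])                      # (D, T)
    envelopes = np.stack([softplus(simulate_envelope(0.1, fs, T, rng))
                          for _ in range(n_envelopes)])           # (N, T)
    amplitudes = W @ envelopes                                    # (D, T), nonnegative
    signal = np.sum(amplitudes * subbands, axis=0)
    return signal, amplitudes

# Hypothetical example: three channels sharing two envelopes.
W = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])
signal, amplitudes = sample_signal([220.0, 440.0, 880.0], W)
```

Each channel carries only a two-dimensional state, and the simulation visits every time step once, which illustrates why state space formulations of this kind can handle long signals. Inference in the paper's setting would replace these forward draws with filtering and smoothing passes over the same state.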

research
11/06/2018

Unifying Probabilistic Models for Time-Frequency Analysis

In audio signal processing, probabilistic time-frequency models have man...
research
11/03/2020

Quasi Monte Carlo Time-Frequency Analysis

We study signal processing tasks in which the signal is mapped via some ...
research
05/19/2017

Efficient Learning of Harmonic Priors for Pitch Detection in Polyphonic Music

Automatic music transcription (AMT) aims to infer a latent symbolic repr...
research
08/14/2023

Compositional nonlinear audio signal processing with Volterra series

We develop a compositional theory of nonlinear audio signal processing b...
research
07/12/2020

State Space Expectation Propagation: Efficient Inference Schemes for Temporal Gaussian Processes

We formulate approximate Bayesian inference in non-conjugate temporal an...
research
01/24/2023

Mesostructures: Beyond Spectrogram Loss in Differentiable Time-Frequency Analysis

Computer musicians refer to mesostructures as the intermediate levels of...
research
03/19/2023

Multiscale Audio Spectrogram Transformer for Efficient Audio Classification

Audio event has a hierarchical architecture in both time and frequency a...
