Point Cloud Audio Processing

05/06/2021
by Krishna Subramani, et al.

Most audio processing pipelines involve transformations that act on fixed-dimensional input representations of audio. For example, when using the Short-Time Fourier Transform (STFT), the DFT size specifies a fixed dimension for the input representation. As a consequence, most audio machine learning models are designed to process fixed-size vector inputs, which often prevents learned models from being repurposed for audio with different sampling rates or alternative representations. We note, however, that the intrinsic spectral information in the audio signal is invariant to the choice of input representation or sampling rate. Motivated by this, we introduce a novel way of processing audio signals by treating them as a collection of points in feature space, and we use point cloud machine learning models that give us invariance to the choice of representation parameters, such as the DFT size or the sampling rate. Additionally, we observe that these methods result in smaller models and allow us to significantly subsample the input representation with minimal effect on a trained model's performance.
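The idea of treating a spectrogram as a set of points can be illustrated with a minimal NumPy sketch. The abstract does not state which per-point features the authors use, so the (time in seconds, frequency in Hz, normalized magnitude) triple below is an assumption, as are the hypothetical helper names `stft_point_cloud` and `subsample`, the Hann window, and the 50% hop. The sketch only shows that different DFT sizes yield a different number of points but the same per-point dimensionality, which is the property a point cloud model relies on.

```python
import numpy as np


def stft_point_cloud(signal, sample_rate, dft_size, hop=None):
    """Turn a 1-D signal into an (n_points, 3) array of
    (time [s], frequency [Hz], magnitude) points.

    Hypothetical helper: the feature choice is an assumption,
    not the representation specified by the paper.
    """
    hop = hop or dft_size // 2
    window = np.hanning(dft_size)
    # Physical frequency of each bin, independent of how many bins there are.
    freqs = np.fft.rfftfreq(dft_size, d=1.0 / sample_rate)
    n_frames = 1 + (len(signal) - dft_size) // hop
    points = []
    for i in range(n_frames):
        start = i * hop
        frame = signal[start:start + dft_size] * window
        # Divide by the window sum so magnitudes are comparable across DFT sizes.
        mag = np.abs(np.fft.rfft(frame)) / window.sum()
        # Every bin in this frame shares the frame's centre time.
        times = np.full_like(freqs, (start + dft_size / 2) / sample_rate)
        points.append(np.stack([times, freqs, mag], axis=1))
    return np.concatenate(points, axis=0)


def subsample(cloud, n_points, seed=0):
    """Randomly keep n_points rows of the cloud (without replacement)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(cloud), size=n_points, replace=False)
    return cloud[idx]


# One second of a 440 Hz tone, analysed with two different DFT sizes.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

cloud_small = stft_point_cloud(tone, sample_rate, dft_size=256)
cloud_large = stft_point_cloud(tone, sample_rate, dft_size=1024)
print(cloud_small.shape, cloud_large.shape)
```

Both clouds have three columns regardless of the DFT size, and the strongest point in each sits near 440 Hz, so a permutation-invariant model operating on rows could in principle consume either one. `subsample(cloud_small, 1000)` gives a rough picture of the input subsampling the abstract mentions, again only as an illustration.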


research
06/17/2022

NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates

Conventionally, audio super-resolution models fixed the initial and the ...
research
04/24/2019

A Robust Approach for Securing Audio Classification Against Adversarial Attacks

Adversarial audio attacks can be considered as a small perturbation unpe...
research
04/29/2021

Simulating the DFT Algorithm for Audio Processing

Since the evolution of digital computers, the storage of data has always...
research
03/20/2023

A Tiny Machine Learning Model for Point Cloud Object Classification

The design of a tiny machine learning model, which can be deployed in mo...
research
04/24/2023

Grad-PU: Arbitrary-Scale Point Cloud Upsampling via Gradient Descent with Learned Distance Functions

Most existing point cloud upsampling methods have roughly three steps: f...
research
10/25/2019

Learning audio representations via phase prediction

We learn audio representations by solving a novel self-supervised learni...
research
08/31/2018

Speaker Fluency Level Classification Using Machine Learning Techniques

Level assessment for foreign language students is necessary for putting ...