An Investigation of the Effectiveness of Phase for Audio Classification

10/06/2021
by Shunsuke Hidaka et al.

While the log-amplitude mel-spectrogram has been widely used as the feature representation for deep-learning-based speech processing, another aspect of the speech spectrum, i.e., phase information, has recently been shown to be effective for tasks such as speech enhancement and source separation. In this study, we extensively investigated the effectiveness of including phase information of signals in eight audio classification tasks. We constructed a learnable front-end that can compute the phase and its derivatives based on a time-frequency representation with a mel-like frequency axis. Experimental results showed significant performance improvements for musical pitch detection, musical instrument detection, language identification, speaker identification, and birdsong detection. On the other hand, overfitting to the recording condition was observed for some tasks when the instantaneous frequency was used. The results implied that the relationship between the phase values of adjacent elements is more important than the phase itself in audio classification.
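The front-end is only described at a high level in the abstract, but the minimal NumPy/SciPy sketch below illustrates the kind of phase-derived channels it refers to: the raw STFT phase, the instantaneous frequency (phase difference along the time axis), and the group delay (negative phase difference along the frequency axis). The mel-like frequency warping and the learnable components are omitted, and the function name and parameter values are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumption, not the paper's front-end): phase-based
# features computed from a plain STFT with NumPy/SciPy.

import numpy as np
from scipy.signal import stft

def phase_features(x, fs=16000, n_fft=512, hop=160):
    """Return log-amplitude, phase, instantaneous frequency, and group delay."""
    _, _, Z = stft(x, fs=fs, nperseg=n_fft, noverlap=n_fft - hop)

    log_amp = np.log(np.abs(Z) + 1e-8)   # log-amplitude spectrogram
    phase = np.angle(Z)                  # raw phase in (-pi, pi]

    # Instantaneous frequency: frame-to-frame phase difference along time;
    # unwrapping first keeps the differences free of 2*pi jumps.
    ifreq = np.diff(np.unwrap(phase, axis=1), axis=1, prepend=phase[:, :1])

    # Group delay: negative phase difference along the frequency axis.
    gdelay = -np.diff(np.unwrap(phase, axis=0), axis=0, prepend=phase[:1, :])

    return log_amp, phase, ifreq, gdelay

# Example: stack the channels as multi-channel input to a classifier.
x = np.random.randn(16000).astype(np.float32)   # 1 s of dummy audio
feats = np.stack(phase_features(x), axis=0)     # shape: (4, freq, time)
```

The abstract's finding that derivatives of phase help while raw instantaneous frequency can overfit to recording conditions suggests that, in practice, one would experiment with which of these channels to feed to the classifier rather than using all of them by default.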

