Multi-level Attention Model for Weakly Supervised Audio Classification

03/06/2018
by Changsong Yu, et al.

In this paper, we propose a multi-level attention model to solve the weakly labelled audio classification problem. The objective of audio classification is to predict the presence or absence of audio events in an audio clip. Recently, Google published a large-scale weakly labelled dataset called Audio Set, in which each audio clip is labelled only with the presence or absence of audio events, without their onset and offset times. Our multi-level attention model is an extension of the previously proposed single-level attention model. It consists of several attention modules applied to intermediate neural network layers. The outputs of these attention modules are concatenated into a vector, which is followed by a multi-label classifier that makes the final prediction for each class. Experiments show that our model achieves a mean average precision (mAP) of 0.360, outperforming the state-of-the-art single-level attention model (0.327) and the Google baseline (0.314).
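
The architecture described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give layer sizes, the form of each attention module, or which layers carry attention, so everything below is an assumption made for illustration. In particular, `attention_pool` assumes each module weights per-segment sigmoid predictions by a softmax over time, and the names `multi_level_forward`, `layer_weights`, `heads`, and `w_out`, along with the sizes `T`, `D`, `H`, and `K`, are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def attention_pool(h, w_att, w_cla):
    """Collapse segment features h of shape (T, H) into one (K,) vector.

    Assumed form: per-segment class predictions are averaged over time,
    weighted by a softmax attention distribution computed per class.
    """
    logits = h @ w_att                                   # (T, K)
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    att = np.exp(logits)
    att = att / att.sum(axis=0, keepdims=True)           # sums to 1 over time
    cla = sigmoid(h @ w_cla)                             # (T, K), values in (0, 1)
    return (att * cla).sum(axis=0)                       # (K,)


def multi_level_forward(x, layer_weights, heads, w_out):
    """Run x of shape (T, D) through ReLU layers, pooling at chosen levels.

    heads[i] is either None (no attention at layer i) or a pair
    (w_att, w_cla) for the attention module attached to layer i.
    """
    h = x
    pooled = []
    for w, head in zip(layer_weights, heads):
        h = np.maximum(h @ w, 0.0)                       # intermediate layer
        if head is not None:
            pooled.append(attention_pool(h, *head))
    u = np.concatenate(pooled)                           # one vector from all levels
    return sigmoid(u @ w_out)                            # multi-label probabilities


# Toy example with random weights (all sizes are hypothetical).
rng = np.random.default_rng(0)
T, D, H, K = 10, 128, 64, 5   # segments, input dim, hidden dim, classes


def rand(*shape):
    return rng.normal(scale=0.1, size=shape)


x = rng.normal(size=(T, D))
layer_weights = [rand(D, H), rand(H, H), rand(H, H)]
heads = [None, (rand(H, K), rand(H, K)), (rand(H, K), rand(H, K))]
w_out = rand(2 * K, K)        # two attention levels feed the classifier

probs = multi_level_forward(x, layer_weights, heads, w_out)
print(probs.shape)            # one probability per class
```

Because the final layer uses an independent sigmoid per class rather than a softmax across classes, several events can be predicted as present in the same clip, which is what a multi-label setting requires. Only clip-level labels are needed to train such a model, since the attention pooling removes the time axis before the loss is computed.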


research
03/02/2019

Weakly labelled AudioSet Classification with Attention Neural Networks

Audio tagging is the task of predicting the presence or absence of sound...
research
11/21/2019

An End-to-End Audio Classification System based on Raw Waveforms and Mix-Training Strategy

Audio classification can distinguish different kinds of sounds, which is...
research
11/12/2019

Segment Relevance Estimation for Audio Analysis and Weakly-Labelled Classification

We propose a method that quantifies the importance, namely relevance, of...
research
09/03/2019

Multi-level Attention network using text, audio and video for Depression Prediction

Depression has been the leading cause of mental-health illness worldwide...
research
12/02/2019

Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events

We tackle the task of environmental event classification by drawing insp...
research
02/02/2021

PSLA: Improving Audio Event Classification with Pretraining, Sampling, Labeling, and Aggregation

Audio event classification is an active research area and has a wide ran...
research
08/07/2020

A Joint Framework for Audio Tagging and Weakly Supervised Acoustic Event Detection Using DenseNet with Global Average Pooling

This paper proposes a network architecture mainly designed for audio tag...
