Joint Weakly Supervised AT and AED Using Deep Feature Distillation and Adaptive Focal Loss

by Yunhao Liang, et al.

A good joint training framework can improve the performance of weakly supervised audio tagging (AT) and acoustic event detection (AED) simultaneously. In this study, we propose three methods to improve the best teacher-student framework of DCASE2019 Task 4 for both AT and AED. First, we propose a frame-level, target-events based deep feature distillation, which aims to leverage the limited strong-labeled data in a weakly supervised framework to learn better intermediate feature maps. Second, we propose an adaptive focal loss and a two-stage training strategy for more effective and accurate model training, in which the contributions of difficult-to-classify and easy-to-classify acoustic events to the total cost function are adjusted automatically. Third, we design an event-specific post-processing to improve the prediction of target event time-stamps. Our experiments are performed on the public DCASE2019 Task 4 dataset, and the results show that our approach achieves competitive performance in both AT (49.8
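To illustrate the focal-loss idea mentioned in the abstract, here is a minimal sketch of the standard (non-adaptive) binary focal loss for multi-label event classification. It is not the paper's adaptive variant, whose exact form the abstract does not give. The function name `binary_focal_loss` and the default `gamma=2.0` are assumptions made for illustration only.

```python
import math


def binary_focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean standard binary focal loss over per-event probabilities.

    probs   -- predicted probabilities, one per event class (hypothetical input)
    targets -- 0/1 labels, one per event class
    gamma   -- focusing parameter; gamma = 0 gives plain binary cross-entropy
    """
    total = 0.0
    for p, y in zip(probs, targets):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability assigned to the true label.
        p_t = p if y == 1 else 1.0 - p
        # The (1 - p_t)^gamma factor shrinks the loss of easy examples.
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    easy = binary_focal_loss([0.9], [1])  # confidently correct prediction
    hard = binary_focal_loss([0.3], [1])  # poorly classified event
    print(easy, hard)
```

In this standard form the weighting is controlled by a fixed `gamma`. An adaptive scheme such as the one the abstract describes would presumably adjust that weighting during training, but how it does so is not stated here.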







A Joint Framework for Audio Tagging and Weakly Supervised Acoustic Event Detection Using DenseNet with Global Average Pooling

This paper proposes a network architecture mainly designed for audio tag...

Joint Acoustic and Class Inference for Weakly Supervised Sound Event Detection

Sound event detection is a challenging task, especially for scenes with ...

CNN-based Discriminative Training for Domain Compensation in Acoustic Event Detection with Frame-wise Classifier

Domain mismatch is a noteworthy issue in acoustic event detection tasks,...

Semi-supervised Acoustic Event Detection based on tri-training

This paper presents our work of training acoustic event detection (AED) ...

Audio Event and Scene Recognition: A Unified Approach using Strongly and Weakly Labeled Data

In this paper we propose a novel learning framework called Supervised an...

Improving Event Representation via Simultaneous Weakly Supervised Contrastive Learning and Clustering

Representations of events described in text are important for various ta...

Weakly Supervised Arrhythmia Detection Based on Deep Convolutional Neural Network

Supervised deep learning has been widely used in the studies of automati...