CNN-based Discriminative Training for Domain Compensation in Acoustic Event Detection with Frame-wise Classifier

by Tiantian Tang, et al.

Domain mismatch is a significant issue in acoustic event detection, because target-domain data is difficult to obtain in most real applications. In this study, we propose a novel CNN-based discriminative training framework as a domain compensation method to handle this issue. It uses a parallel CNN-based discriminator to learn a pair of high-level intermediate acoustic representations. Together with a binary discriminative loss, the discriminators are forced to maximally exploit the heterogeneous acoustic information in each audio clip that contains target events, which results in robust paired representations that discriminate the target events and the background/domain variations separately. Moreover, to better learn the transient characteristics of target events, a frame-wise classifier is designed to perform the final classification. In addition, a two-stage training scheme with CNN-based discriminator initialization is proposed to further improve system training. All experiments are performed on the DCASE 2018 Task 3 datasets. Results show that our proposal significantly outperforms the official baseline on cross-domain conditions, with a relative AUC improvement of 1.8-12.1%, without any performance degradation on in-domain evaluation conditions.
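The abstract reports its AUC gains as relative values. As a minimal sketch of how such a figure is usually computed (the function name and the AUC values below are hypothetical and are not taken from the paper), the gain is the difference between the two scores divided by the baseline score:

```python
def relative_improvement(baseline_auc: float, system_auc: float) -> float:
    """Relative AUC gain of a system over a baseline, in percent."""
    if baseline_auc <= 0:
        raise ValueError("baseline_auc must be positive")
    return 100.0 * (system_auc - baseline_auc) / baseline_auc


# Hypothetical AUC values, for illustration only (not results from the paper).
example_gain = relative_improvement(baseline_auc=0.80, system_auc=0.84)
print(f"relative AUC improvement: {example_gain:.1f}%")
```

A relative gain is always larger in magnitude than the corresponding absolute AUC difference whenever the baseline AUC is below 1, so the two should not be compared directly.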




