Ontology-aware Learning and Evaluation for Audio Tagging

11/22/2022
by Haohe Liu, et al.

This study defines a new evaluation metric for audio tagging that overcomes a limitation of the conventional mean average precision (mAP) metric, which treats different kinds of sound as independent classes without considering their relations. In addition, because sound labeling is ambiguous, the labels in the training and evaluation sets are not guaranteed to be accurate or exhaustive, which makes robust evaluation with mAP difficult. The proposed metric, ontology-aware mean average precision (OmAP), addresses these weaknesses of mAP by using the AudioSet ontology information during evaluation. Specifically, we reweight the false positive events in the model prediction based on their ontology graph distance to the target classes. OmAP also provides more insight into model performance by evaluating at different levels of coarseness in the ontology graph. We conduct human evaluations and demonstrate that OmAP is more consistent with human perception than mAP. To further verify the importance of the ontology information, we also propose a novel loss function (OBCE) that reweights the binary cross entropy (BCE) loss based on the ontology distance. Our experiments show that OBCE can improve both mAP and OmAP on the AudioSet tagging task.
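The idea of reweighting by ontology graph distance can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy ontology in `EDGES`, the helper names, the `max_dist` cap, and the linear weighting rule (a negative class close to a target class gets a smaller weight). The abstract does not state the actual OmAP or OBCE weighting formula.

```python
import math
from collections import deque

# Hypothetical toy ontology: undirected parent-child edges (not the real AudioSet graph).
EDGES = [
    ("Sound", "Animal"), ("Sound", "Music"),
    ("Animal", "Dog"), ("Animal", "Cat"),
    ("Music", "Guitar"),
]

def build_graph(edges):
    """Build an adjacency map from a list of undirected edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def graph_distance(graph, source):
    """Breadth-first search: number of edges from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def negative_weights(graph, classes, positives, max_dist=4):
    """Assumed rule: weight of a negative class grows linearly with its
    distance to the nearest positive class, capped at `max_dist`."""
    dists = [graph_distance(graph, p) for p in positives]
    weights = {}
    for c in classes:
        if c in positives or not dists:
            weights[c] = 1.0  # positives (or clips with no positives) keep full weight
        else:
            nearest = min(d[c] for d in dists)
            weights[c] = min(nearest, max_dist) / max_dist
    return weights

def weighted_bce(probs, targets, weights, eps=1e-7):
    """Binary cross entropy averaged over classes, with one weight per class."""
    total = 0.0
    for c, p in probs.items():
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        y = targets[c]
        total += -weights[c] * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(probs)

graph = build_graph(EDGES)
classes = ["Dog", "Cat", "Guitar"]
targets = {"Dog": 1, "Cat": 0, "Guitar": 0}
weights = negative_weights(graph, classes, positives={"Dog"})

# Same confidence, different mistake: a false positive on the nearby class "Cat"
# versus one on the distant class "Guitar".
loss_near = weighted_bce({"Dog": 0.9, "Cat": 0.8, "Guitar": 0.1}, targets, weights)
loss_far = weighted_bce({"Dog": 0.9, "Cat": 0.1, "Guitar": 0.8}, targets, weights)
```

In this toy setup, `loss_near` is smaller than `loss_far`: confusing a dog with a cat is penalized less than confusing it with a guitar, which is the behavior the abstract describes in words.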


research
08/06/2018

Audio Tagging With Connectionist Temporal Classification Model Using Sequential Labelled Data

Audio tagging aims to predict one or several labels in an audio clip. Ma...
research
03/02/2019

Weakly Labelled AudioSet Tagging with Attention Neural Networks

Audio tagging is the task of predicting the presence or absence of sound...
research
07/21/2022

Efficient Graph-Friendly COCO Metric Computation for Train-Time Model Evaluation

Evaluating the COCO mean average precision (MaP) and COCO recall metrics...
research
03/02/2019

Weakly labelled AudioSet Classification with Attention Neural Networks

Audio tagging is the task of predicting the presence or absence of sound...
research
05/24/2018

Coarse-to-fine Seam Estimation for Image Stitching

Seam-cutting and seam-driven techniques have been proven effective for h...
research
09/14/2022

Meta Pattern Concern Score: A Novel Metric for Customizable Evaluation of Multi-classification

Classifiers have been widely implemented in practice, while how to evalu...
research
10/03/2021

Enriching Ontology with Temporal Commonsense for Low-Resource Audio Tagging

Audio tagging aims at predicting sound events occurred in a recording. T...
