Zero-shot Sound Event Classification Using a Sound Attribute Vector with Global and Local Feature Learning

03/18/2023
by Yi-Han Lin, et al.

This paper introduces a zero-shot sound event classification (ZS-SEC) method for identifying sound events that do not appear in the training data. In our previous work, we proposed a ZS-SEC method based on sound attribute vectors (SAVs), in which a deep neural network infers attribute information describing the sound of an event class instead of inferring the class label directly. That method could classify unseen events to some extent, but its accuracy on unseen events was far below its accuracy on seen events. In this paper, we propose a new ZS-SEC method that learns discriminative global features and local features simultaneously to improve SAV-based ZS-SEC. The global features are learned to discriminate the event classes in the training data, while the spectro-temporal local features are learned to regress the attribute information using attribute prototypes. Experimental results show that the proposed method improves the accuracy of SAV-based ZS-SEC and can visualize the regions of the spectrogram related to each attribute.
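To make the idea of attribute-based zero-shot inference concrete, the sketch below shows one plausible way such a pipeline could be wired together. It is not the authors' implementation: the attribute names, class names, numeric values, and all function names are hypothetical, and cosine similarity and max pooling are assumed choices that the abstract does not specify. The first part matches a predicted attribute vector against the SAVs of unseen classes; the second part scores one attribute by comparing each spectro-temporal local feature with a single attribute prototype, which also yields a map that could be overlaid on the spectrogram.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def classify_by_sav(predicted_attributes, class_savs):
    """Return the class whose SAV is most similar to the predicted attributes.

    Assumption: nearest-SAV matching with cosine similarity.
    """
    return max(class_savs, key=lambda name: cosine(predicted_attributes, class_savs[name]))

def attribute_map(local_features, prototype):
    """Dot-product similarity between each local feature and one attribute prototype.

    local_features[t][f] is the feature vector at time index t and frequency index f.
    """
    return [[sum(a * b for a, b in zip(cell, prototype)) for cell in row]
            for row in local_features]

def attribute_score(sim_map):
    """Assumed pooling: the attribute score is the maximum over the similarity map."""
    return max(max(row) for row in sim_map)

# Hypothetical attributes: [metallic, high-pitched, repetitive].
# Neither class below is assumed to appear in the training data.
unseen_class_savs = {
    "doorbell": [1.0, 1.0, 0.0],
    "siren": [0.0, 1.0, 1.0],
}
predicted = [0.1, 0.8, 0.9]  # stand-in for the network's attribute output
predicted_class = classify_by_sav(predicted, unseen_class_savs)

# Hypothetical 2x2 grid of 2-dimensional local features and one prototype.
local_features = [
    [[0.1, 0.9], [0.2, 0.3]],
    [[0.9, 0.1], [0.0, 0.5]],
]
prototype = [1.0, 0.0]
sim_map = attribute_map(local_features, prototype)
score = attribute_score(sim_map)
```

In this toy setup the predicted attributes lie closest to the "siren" SAV, and the largest value in `sim_map` marks the time-frequency cell that most resembles the prototype, which is the kind of region an attribute visualization would highlight.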
