Heterogeneous Graph Learning for Acoustic Event Classification

03/05/2023
by Amir Shirian, et al.

Heterogeneous graphs provide a compact, efficient, and scalable way to model data involving multiple disparate modalities, which makes them an attractive option for modeling audiovisual data. However, graph structure does not appear naturally in audiovisual data. Graphs for audiovisual data are instead constructed manually, which is both difficult and sub-optimal. In this work, we address this problem by (i) proposing a parametric graph construction strategy for the intra-modal edges, and (ii) learning the crossmodal edges. To this end, we develop a new model, the heterogeneous graph crossmodal network (HGCN), that learns the crossmodal edges. Owing to its parametric construction, our proposed model can adapt to various spatial and temporal scales, while the learnable crossmodal edges effectively connect the relevant nodes across modalities. Experiments on a large benchmark dataset (AudioSet) show that our model achieves state-of-the-art performance (0.53 mean average precision), outperforming transformer-based models and other graph-based models.
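The abstract does not give the construction details, so the following is only a minimal sketch of the two ideas it names, not the HGCN implementation. It assumes that intra-modal edges link temporally nearby segments under two hypothetical parameters (`span` and `dilation`), and that crossmodal edge weights come from a bilinear score between audio and video node features, normalized with a softmax. The function names, the parameters, and the bilinear form are all illustrative assumptions.

```python
import math


def intra_modal_edges(num_nodes, span, dilation=1):
    """Parametric intra-modal edges (assumed scheme, not from the paper).

    Node i is linked to nodes i + k * dilation for k = 1..span.
    Edges are stored in both directions, so the graph is undirected.
    """
    edges = set()
    for i in range(num_nodes):
        for k in range(1, span + 1):
            j = i + k * dilation
            if j < num_nodes:
                edges.add((i, j))
                edges.add((j, i))
    return sorted(edges)


def crossmodal_edge_weights(audio_feats, video_feats, w):
    """Soft crossmodal edges (assumed scheme, not from the paper).

    Each audio/video pair is scored with the bilinear form a^T W v.
    A softmax over the video nodes turns the scores of one audio node
    into edge weights that sum to 1. In a trained model, W would be
    updated by gradient descent; here it is just a fixed matrix.
    """
    weights = []
    for a in audio_feats:
        scores = [
            sum(a[p] * w[p][q] * v[q]
                for p in range(len(a))
                for q in range(len(v)))
            for v in video_feats
        ]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


if __name__ == "__main__":
    # Four audio segments, each linked to its immediate temporal neighbour.
    print(intra_modal_edges(4, span=1))

    # Toy 2-dimensional features and an identity scoring matrix.
    audio = [[1.0, 0.0], [0.0, 1.0]]
    video = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    for row in crossmodal_edge_weights(audio, video, identity):
        print([round(x, 3) for x in row])
```

Changing `span` and `dilation` alters the temporal scale that the intra-modal graph covers without hand-editing the edge list, which is one way to read "parametric construction". The soft weights could be thresholded or kept dense, depending on how sparse the crossmodal graph should be.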


