Listen, Read, and Identify: Multimodal Singing Language Identification of Music

03/02/2021
by   Keunwoo Choi, et al.
0

We propose a multimodal singing language classification model that uses both audio content and textual metadata. LRID-Net, the proposed model, takes an audio signal and a language probability vector estimated from the metadata and outputs the probabilities of the target languages. Optionally, LRID-Net is facilitated with modality dropouts to handle a missing modality. In the experiment, we trained several LRID-Nets with varying modality dropout configuration and test them with various combinations of input modalities. The experiment results demonstrate that using multimodal input improves the performance. The results also suggest that adopting modality dropout does not degrade performance of the model when there are full modality inputs while enabling the model to handle missing modality cases to some extent.

READ FULL TEXT
POST COMMENT

Comments

There are no comments yet.

Authors

page 4

page 5

08/21/2018

LRMM: Learning to Recommend with Missing Modalities

Multimodal learning has shown promising performance in content-based rec...
10/19/2019

Coordinated Joint Multimodal Embeddings for Generalized Audio-Visual Zeroshot Classification and Retrieval of Videos

We present an audio-visual multimodal approach for the task of zeroshot ...
11/13/2015

Symbol Grounding Association in Multimodal Sequences with Missing Elements

In this paper, we extend a symbolic association framework for being able...
03/06/2022

Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos

With the assumption that a video dataset is multimodality annotated in w...
08/19/2019

A unified representation network for segmentation with missing modalities

Over the last few years machine learning has demonstrated groundbreaking...
10/02/2020

Training Strategies to Handle Missing Modalities for Audio-Visual Expression Recognition

Automatic audio-visual expression recognition can play an important role...
03/12/2021

Orthogonal Statistical Inference for Multimodal Data Analysis

Multimodal imaging has transformed neuroscience research. While it prese...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.