Large Language Models (LLMs) have shown immense potential in multimodal ...
Recent developments in MIR have led to several benchmark deep learning m...
Self-supervised learning (SSL) has shown promising results in various sp...
We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic ...
In the era of extensive intersection between art and Artificial Intellig...
Self-supervised learning (SSL) has recently emerged as a promising parad...
The deep learning community has witnessed an exponentially growing inter...
Learning music representations that are general-purpose offers the flexi...
As one of the most intuitive interfaces known to humans, natural languag...
Loss-gradients are used to interpret the decision-making process of deep...
Imitating musical instruments with the human voice is an efficient way o...
Most recent research about automatic music transcription (AMT) uses conv...
In recent years, the accuracy of automatic lyrics alignment methods has ...
Audio representations for music information retrieval are typically lear...
Sound scene geotagging is a new topic of research which has evolved from...
Animal vocalisations contain important information about health, emotion...
Non-intrusive speech quality assessment is a crucial operation in multim...
This paper proposes a deep convolutional neural network for performing n...
Content-based music information retrieval has seen rapid progress with t...
Recent advances in automatic music transcription (AMT) have achieved hig...
This paper addresses the problem of domain adaptation for the task of mu...
Most of the state-of-the-art automatic music transcription (AMT) models ...
The Automatic Speaker Verification Spoofing and Countermeasures Challeng...
One way to analyse the behaviour of machine learning models is through l...
In this paper we investigate the importance of the extent of memory in s...
This technical report gives a detailed, formal description of the featur...
Audio impairment recognition is based on finding noise in audio files an...
Plate and spring reverberators are electromechanical systems first used ...
Polyphonic Sound Event Detection (SED) in real-world recordings is a cha...
Adversarial attacks refer to a set of methods that perturb the input to ...
Audio processors whose parameters are modified periodically over time ar...
In this paper we propose an efficient deep learning encoder-decoder netw...
The majority of sound scene analysis work focuses on one of two clearly ...
Acoustic Scene Classification (ASC) and Sound Event Detection (SED) are ...
One way to interpret trained deep neural networks (DNNs) is by inspectin...
Detecting spoofing attempts of automatic speaker verification (ASV) syst...
The absence of the queen in a beehive is a very strong indicator of the ...
In this work, we aim to explore the potential of machine learning method...
Acoustic Scene Classification (ASC) is one of the core research problems...
We present a new extensible and divisible taxonomy for open set sound sc...
The second Automatic Speaker Verification Spoofing and Countermeasures c...
In this paper, we show empirical evidence on how to construct the optima...
As part of the 2016 public evaluation challenge on Detection and Classif...
We present a supervised neural network model for polyphonic piano music ...
This paper introduces a model of environmental acoustic scenes which ado...