Music source separation (MSS) aims to separate a music recording into
mu...
The front-end is a critical component of English text-to-speech (TTS)
sy...
Universal source separation (USS) is a fundamental research task for
com...
The advancement of audio-language (AL) multimodal learning tasks has bee...
In this paper, we introduce Jointist, an instrument-aware multi-instrume...
This study defines a new evaluation metric for audio tagging tasks to
ov...
Binaural rendering of ambisonic signals is of broad interest to virtual
...
Audio captioning is the task of generating captions that describe the co...
Sound field decomposition predicts waveforms in arbitrary directions usi...
The audio spectrogram is a time-frequency representation that has been w...
Recently, there has been increasing interest in building efficient audio...
Sound event localization and detection (SELD) is a joint task of sound e...
Few-shot audio event detection is a task that detects the occurrence tim...
Few-shot bioacoustic event detection is a task that detects the occurren...
In this paper, we introduce Jointist, an instrument-aware multi-instrume...
Speech restoration aims to remove distortions in speech signals. Prior
m...
In this paper, we introduce the task of language-queried audio source
se...
Speech super-resolution (SR) is the task of increasing the speech sampling rate ...
Polyphonic sound event localization and detection (SELD) aims at detecti...
Music source separation (MSS) shows active progress with deep learning m...
Speech restoration aims to remove distortions in speech signals. Prior
m...
Deep neural network-based methods have been successfully applied to musi...
We propose a unified model for three inter-related tasks: 1) to
separate...
This paper proposes a deep learning framework for classification of BBC
...
Speech enhancement aims to obtain speech signals with high intelligibili...
Speech enhancement is the task of improving the intelligibility and perceptu...
Music source separation (MSS) is the task of separating a music piece in...
Acoustic Scene Classification (ASC) aims to classify the environment in ...
Music classification is the task of classifying a music piece into labels suc...
Polyphonic sound event localization and detection (SELD), which jointly
...
Symbolic music datasets are important for music information retrieval an...
Automatic music transcription (AMT) is the task of transcribing audio
re...
Polyphonic sound event localization and detection involves not only detecting ...
This paper presents a Depthwise Disout Convolutional Neural Network (DD-...
High-quality labeled datasets have allowed deep learning to achieve
impr...
In supervised machine learning, the assumption that training data is lab...
Source separation is the task of separating an audio recording into indivi...
Audio pattern recognition is an important research topic in the machine
...
Sound event detection (SED) is the task of detecting sound events in an audio...
The availability of large-scale household energy consumption datasets bo...
Single-channel signal separation and deconvolution aims to separate and
...
Sound event detection (SED) and localization refer to recognizing sound
...
Sound event detection (SED) methods typically rely on either strongly
la...
The Detection and Classification of Acoustic Scenes and Events (DCASE) 2...
The Detection and Classification of Acoustic Scenes and Events (DCASE) 2...
Audio tagging is the task of predicting the presence or absence of sound...
Audio tagging is the task of predicting the presence or absence of sound...
Audio tagging aims to detect the types of sound events occurring in an a...
Sound event detection (SED) is typically posed as a supervised learning
...
Audio tagging aims to infer descriptive labels from audio clips. Audio
t...