
PodcastMix: A dataset for separating music and speech in podcasts

by Nicolas Schmidt, et al.
Universitat Pompeu Fabra

We introduce PodcastMix, a dataset formalizing the task of separating background music and foreground speech in podcasts. We aim to define a benchmark suitable for training and evaluating (deep learning) source separation models. To that end, we release a large and diverse training dataset based on programmatically generated podcasts. However, current (deep learning) models can run into generalization issues, especially when trained on synthetic data. To address potential generalization issues, we release an evaluation set based on real podcasts, for which we design objective and subjective tests. From our experiments with real podcasts, we find that current (deep learning) models may have generalization issues. Yet they can still perform competently: for example, our best baseline separates speech with a mean opinion score of 3.84 (rating "overall separation quality" from 1 to 5). The dataset and baselines are accessible online.



