Improving the transferability of speech separation by meta-learning

03/11/2022
by Kuan-Po Huang, et al.

Speech separation aims to separate multiple speech sources from a speech mixture. Although speech separation is well solved on some existing English benchmarks, the generalizability of speech separation models to accents or languages unseen during training deserves further investigation. This paper adopts meta-learning-based methods to improve the transferability of speech separation models. With these methods, we found that models trained only on speech with a single accent, the native English accent, can still be adapted to new, unseen accents from the Speech Accent Archive. We compared the results with human-rated native-likeness of accents, showing that the transferability of MAML (model-agnostic meta-learning) methods depends less on the similarity between training and testing data than that of typical transfer learning methods. Furthermore, we found that the models can handle speech in different languages from the CommonVoice corpus during the testing phase. Most importantly, the MAML methods outperform typical transfer learning methods on new accents, new speakers, new languages, and noisy environments.
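
For readers unfamiliar with MAML (model-agnostic meta-learning), the idea of training an initialization that adapts quickly to unseen tasks can be sketched on a toy problem. The example below is only an illustration under strong assumptions: it uses a scalar linear model y = w * x instead of a speech separation network, treats each slope as a "task" (standing in for an accent or language), and all function names (`loss_grad_hess`, `sample_task`, `adapt`, `maml_train`) are hypothetical and not taken from the paper. Because the loss is quadratic, the second-order term of the MAML gradient can be written analytically.

```python
import random


def loss_grad_hess(w, xs, ys):
    """Mean squared error of y = w * x, plus its first and second derivative in w."""
    n = len(xs)
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
    hess = sum(2 * x * x for x in xs) / n
    return loss, grad, hess


def sample_task(rng, slope, n=10):
    """Draw a (support, query) pair of datasets from the toy task y = slope * x."""
    def draw():
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        return xs, [slope * x for x in xs]
    return draw(), draw()


def adapt(w, support, inner_lr=0.1):
    """Inner loop: one gradient step on the support set of a single task."""
    _, grad, _ = loss_grad_hess(w, *support)
    return w - inner_lr * grad


def maml_train(train_slopes, steps=500, inner_lr=0.1, outer_lr=0.05, seed=0):
    """Outer loop: update the initialization w using the post-adaptation query loss."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        meta_grad = 0.0
        for slope in train_slopes:
            support, query = sample_task(rng, slope)
            _, g_support, h_support = loss_grad_hess(w, *support)
            w_adapted = w - inner_lr * g_support
            _, g_query, _ = loss_grad_hess(w_adapted, *query)
            # Chain rule through the inner step: d(w_adapted)/dw = 1 - inner_lr * h_support
            meta_grad += g_query * (1.0 - inner_lr * h_support)
        w -= outer_lr * meta_grad / len(train_slopes)
    return w


if __name__ == "__main__":
    w0 = maml_train([1.0, 2.0, 3.0])
    # Adapt to a slope that was never seen during meta-training.
    support, query = sample_task(random.Random(1), 5.0)
    before, _, _ = loss_grad_hess(w0, *query)
    after, _, _ = loss_grad_hess(adapt(w0, support), *query)
    print("query loss before adaptation:", before)
    print("query loss after adaptation:", after)
```

In this sketch the inner step plays the role of adapting to a new accent from a few examples, and the outer step corresponds to meta-training on the available tasks. A first-order variant would simply drop the `(1.0 - inner_lr * h_support)` factor.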


