Onssen: an open-source speech separation and enhancement library

11/03/2019
by Zhaoheng Ni, et al.

Speech separation is an essential task for multi-talker speech recognition. Many deep learning approaches have recently been proposed, and they continue to advance the state of the art. However, the lack of available algorithm implementations restricts researchers to comparing methods on the same dataset. A generic platform would let researchers implement novel separation algorithms easily and compare them with existing ones on customized datasets. We introduce "onssen": an open-source speech separation and enhancement library. onssen is a library aimed mainly at deep learning separation and enhancement algorithms. It uses the LibRosa and NumPy libraries for feature extraction and PyTorch as the back-end for model training. onssen supports most time-frequency (T-F) mask-based separation algorithms (e.g. deep clustering, chimera net, and chimera++) and also supports customized datasets. In this paper, we describe the functionality of the modules in onssen and show that the algorithms implemented in onssen achieve the same performance as reported in the original papers.
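To make the idea of T-F mask-based separation concrete, here is a minimal sketch in plain NumPy. It is not onssen's API: the function names `stft_mag` and `ideal_ratio_masks`, the synthetic sinusoid "speakers", and the STFT parameters are all assumptions chosen for illustration. It also uses an oracle mask computed from the known sources, whereas algorithms such as deep clustering or chimera++ train a network to estimate the mask from the mixture alone.

```python
import numpy as np

def stft_mag(signal, n_fft=256, hop=64):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))  # shape: (n_frames, n_fft // 2 + 1)

def ideal_ratio_masks(source_mags, eps=1e-8):
    """Oracle masks: each source's share of the summed magnitude per T-F bin."""
    total = np.sum(source_mags, axis=0) + eps  # eps avoids division by zero
    return source_mags / total

# Two synthetic "speakers" (hypothetical stand-ins): sinusoids at 440 Hz and 1200 Hz.
sr = 8000
t = np.arange(sr) / sr
s1 = np.sin(2 * np.pi * 440 * t)
s2 = np.sin(2 * np.pi * 1200 * t)
mixture = s1 + s2

source_mags = np.stack([stft_mag(s1), stft_mag(s2)])  # (2, n_frames, n_bins)
mix_mag = stft_mag(mixture)                           # (n_frames, n_bins)

masks = ideal_ratio_masks(source_mags)
estimates = masks * mix_mag  # mask broadcast over the mixture spectrogram

# Frequency bin carrying the most energy in each separated estimate.
peak_bins = estimates.mean(axis=1).argmax(axis=1)
print(masks.shape, peak_bins)
```

Each mask assigns every time-frequency bin of the mixture to a source in proportion to that source's magnitude, so multiplying the mixture spectrogram by a mask keeps the bins dominated by one speaker. In a learned system, the training target or loss is typically derived from masks like these, and the waveform is recovered by combining the masked magnitude with the mixture phase and inverting the STFT.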

research
11/07/2020

ESPnet-SE: End-to-End Speech Enhancement and Separation Toolkit Designed for ASR Integration

We present ESPnet-SE, which is designed for the quick development of spe...
research
12/18/2018

wav2letter++: The Fastest Open-source Speech Recognition System

This paper introduces wav2letter++, the fastest open-source deep learnin...
research
04/16/2019

Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering

Speech separation has been very successful with deep learning techniques...
research
11/04/2020

DESNet: A Multi-channel Network for Simultaneous Speech Dereverberation, Enhancement and Separation

In this paper, we propose a multi-channel network for simultaneous speec...
research
05/08/2020

Neural Spatio-Temporal Beamformer for Target Speech Separation

Purely neural network (NN) based speech separation and enhancement metho...
research
11/29/2019

Improving Voice Separation by Incorporating End-to-end Speech Recognition

Despite recent advances in voice separation methods, many challenges rem...
research
10/17/2022

TorchDIVA: An Extensible Computational Model of Speech Production built on an Open-Source Machine Learning Library

The DIVA model is a computational model of speech motor control that com...