Block-Online Guided Source Separation

11/16/2020
by Shota Horiguchi, et al.

We propose a block-online algorithm for guided source separation (GSS). GSS is a speech separation method that uses diarization information to update the parameters of a generative model of the observed signals. Previous studies have shown that GSS performs well in multi-talker scenarios. However, it requires a large amount of computation, which is an obstacle to deployment in online applications. Another problem is that offline GSS is an utterance-wise algorithm, so its latency grows with the length of the utterance. In the proposed algorithm, block-wise input samples and the corresponding time annotations are concatenated with those of the preceding context and used to update the parameters. Using this context enables the algorithm to estimate time-frequency masks accurately with only one optimization iteration per block, and its latency depends not on the utterance length but on a predetermined block length. The algorithm also reduces computation cost by updating only the parameters of speakers active in each block and its context. Evaluation on the CHiME-6 corpus and a meeting corpus showed that the proposed algorithm achieved almost the same performance as the conventional offline GSS algorithm with 32x faster computation, which is sufficient for real-time applications.
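
The block-online loop described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the per-block mask update is a hypothetical placeholder (`update_masks_one_iteration`), whereas actual GSS would run one EM iteration of a spatial mixture model (e.g., a cACGMM) followed by mask-based beamforming. The block and context lengths, array shapes, and the random-data usage example are assumptions chosen for illustration.

```python
# Minimal sketch of a block-online loop with a fixed-length preceding context,
# assuming STFT frames and diarization-based activity are already available.
import numpy as np


def update_masks_one_iteration(stft_chunk, activity_chunk, params):
    """Placeholder for one optimization iteration of the mask model.

    stft_chunk:     (frames, freq, channels) complex STFT of block + context
    activity_chunk: (frames, speakers) 0/1 diarization-based activity
    params:         dict of per-speaker model parameters (updated in place)
    Returns time-frequency masks of shape (frames, freq, speakers).
    """
    # Hypothetical stand-in: masks proportional to the annotated activity.
    masks = activity_chunk[:, None, :].astype(float)
    masks /= np.maximum(masks.sum(axis=-1, keepdims=True), 1e-10)
    return masks


def block_online_gss(stft_stream, activity_stream, block_len=150, context_len=150):
    """Process STFT frames block by block with a fixed-length left context."""
    params = {}                      # per-speaker model parameters
    ctx_stft, ctx_act = None, None   # preceding-context buffers
    for start in range(0, stft_stream.shape[0], block_len):
        blk_stft = stft_stream[start:start + block_len]
        blk_act = activity_stream[start:start + block_len]
        # Concatenate the current block with its preceding context.
        if ctx_stft is not None:
            chunk_stft = np.concatenate([ctx_stft, blk_stft], axis=0)
            chunk_act = np.concatenate([ctx_act, blk_act], axis=0)
        else:
            chunk_stft, chunk_act = blk_stft, blk_act
        # Only speakers active in the block or its context are updated,
        # which is where the reported computation savings come from.
        active = np.where(chunk_act.any(axis=0))[0]
        masks = update_masks_one_iteration(chunk_stft, chunk_act[:, active], params)
        # Emit masks for the current block only; latency is fixed by block_len.
        yield start, active, masks[-blk_stft.shape[0]:]
        # Keep the most recent frames as context for the next block.
        ctx_stft = chunk_stft[-context_len:]
        ctx_act = chunk_act[-context_len:]


# Hypothetical usage with random data standing in for a real recording:
stft = np.random.randn(1000, 257, 8) + 1j * np.random.randn(1000, 257, 8)
activity = (np.random.rand(1000, 4) > 0.7).astype(int)
for start, active, block_masks in block_online_gss(stft, activity):
    pass  # apply block_masks in a mask-based beamforming front-end here
```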
