Temporarily-Aware Context Modelling using Generative Adversarial Networks for Speech Activity Detection

04/02/2020
by Tharindu Fernando, et al.

This paper presents a novel joint learning framework for Speech Activity Detection (SAD), inspired by the recent success of multi-task learning approaches in the speech processing domain. We utilise generative adversarial networks to automatically learn a loss function for joint prediction of the frame-wise speech/non-speech classifications together with the next audio segment. To exploit the temporal relationships within the input signal, we propose a temporal discriminator which aims to ensure that the predicted signal is temporally consistent. We evaluate the proposed framework on multiple public benchmarks, including NIST OpenSAT'17, AMI Meeting and HAVIC, where we demonstrate its capability to outperform state-of-the-art SAD approaches. Furthermore, our cross-database evaluations demonstrate the robustness of the proposed approach across different languages, accents, and acoustic environments.

research
07/06/2017

Statistical Parametric Speech Synthesis Using Generative Adversarial Networks Under A Multi-task Learning Framework

In this paper, we aim at improving the performance of synthesized speech...
research
12/18/2018

GD-GAN: Generative Adversarial Networks for Trajectory Prediction and Group Detection in Crowds

This paper presents a novel deep learning framework for human trajectory...
research
04/17/2021

Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement

The intelligibility of speech severely degrades in the presence of envir...
research
03/02/2021

Long-Running Speech Recognizer: An End-to-End Multi-Task Learning Framework for Online ASR and VAD

When we use End-to-end automatic speech recognition (E2E-ASR) system for...
research
05/13/2018

Learning Temporal Strategic Relationships using Generative Adversarial Imitation Learning

This paper presents a novel framework for automatic learning of complex ...
research
11/05/2017

Robust Speech Recognition Using Generative Adversarial Networks

This paper describes a general, scalable, end-to-end framework that uses...
research
03/09/2018

Task Specific Visual Saliency Prediction with Memory Augmented Conditional Generative Adversarial Networks

Visual saliency patterns are the result of a variety of factors aside fr...
