Mask Detection and Breath Monitoring from Speech: on Data Augmentation, Feature Representation and Modeling

08/12/2020
by   Haiwei Wu, et al.

This paper introduces our approaches for the Mask and Breathing Sub-Challenges in the Interspeech COMPARE Challenge 2020. For the mask detection task, we train deep convolutional neural networks with filter-bank energies, gender-aware features, and speaker-aware features. Support Vector Machines then serve as back-end classifiers for binary prediction on the extracted deep embeddings. Several data augmentation schemes, including speed perturbation, SpecAugment, and random erasing, are used to increase the quantity of training data and improve our models' robustness. For the speech breath monitoring task, we investigate different bottleneck features based on the Bi-LSTM structure. Experimental results show that our proposed methods outperform the baselines, achieving 0.746 PCC on the Breathing evaluation set and 78.8% UAR on the Mask evaluation set.
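Two of the augmentation schemes named in the abstract, SpecAugment-style masking and random erasing, can be illustrated on a filter-bank feature matrix. The following is a minimal sketch and not the authors' implementation: the function names, the mask widths, the zero fill value, and the list-of-lists layout (rows are frequency bins, columns are frames) are all assumptions made for illustration, and the time-warping step of SpecAugment is left out.

```python
import random


def spec_augment(spec, max_freq_width=8, max_time_width=20, rng=random):
    """Zero one random band of frequency bins and one random band of frames.

    Hypothetical helper: widths and the zero fill value are illustrative choices.
    """
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # copy so the input is left untouched
    f_width = rng.randint(0, min(max_freq_width, n_freq))
    f_start = rng.randint(0, n_freq - f_width)
    t_width = rng.randint(0, min(max_time_width, n_time))
    t_start = rng.randint(0, n_time - t_width)
    for f in range(n_freq):
        for t in range(n_time):
            in_freq_band = f_start <= f < f_start + f_width
            in_time_band = t_start <= t < t_start + t_width
            if in_freq_band or in_time_band:
                out[f][t] = 0.0
    return out


def random_erase(spec, max_height=8, max_width=20, rng=random):
    """Overwrite one random rectangle with values drawn from the input's range.

    Hypothetical helper: the rectangle size limits are illustrative choices.
    """
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]
    lo = min(min(row) for row in spec)
    hi = max(max(row) for row in spec)
    height = rng.randint(1, min(max_height, n_freq))
    width = rng.randint(1, min(max_width, n_time))
    top = rng.randint(0, n_freq - height)
    left = rng.randint(0, n_time - width)
    for f in range(top, top + height):
        for t in range(left, left + width):
            out[f][t] = rng.uniform(lo, hi)
    return out


# Toy 40-bin x 50-frame "spectrogram" with arbitrary values.
features = [[float(f + t) for t in range(50)] for f in range(40)]
masked = spec_augment(features, rng=random.Random(0))
erased = random_erase(features, rng=random.Random(0))
```

Both functions return a new matrix of the same shape, so either could be applied on the fly to each training example before it is passed to a network.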

