Mic2Mic: Using Cycle-Consistent Generative Adversarial Networks to Overcome Microphone Variability in Speech Systems

03/27/2020
by Akhil Mathur, et al.

Mobile and embedded devices increasingly use microphones and audio-based computational models to infer user context. A major challenge in building systems that combine audio models with commodity microphones is guaranteeing their accuracy and robustness in the real world. Besides many environmental dynamics, a primary factor that affects the robustness of audio models is microphone variability. In this work, we propose Mic2Mic, a machine-learned system component that resides in the inference pipeline of audio models and reduces, in real time, the variability in audio data caused by microphone-specific factors. Two key considerations in the design of Mic2Mic were: a) to decouple the problem of microphone variability from the audio task, and b) to put a minimal burden on end-users to provide training data. With these in mind, we apply the principles of cycle-consistent generative adversarial networks (CycleGANs) to learn Mic2Mic using unlabeled and unpaired data collected from different microphones. Our experiments show that Mic2Mic can recover between 66 for two common audio tasks.
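To illustrate the cycle-consistency idea the abstract relies on, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the toy "microphone" mappings (a fixed gain), and the batch shapes are all assumptions chosen for illustration. It shows only the general principle that a translation from microphone A to microphone B, followed by the reverse translation, should return the original signal, and that this can be checked on unpaired batches from each microphone.

```python
import numpy as np


def cycle_consistency_loss(g_ab, f_ba, batch_a, batch_b):
    """L1 cycle loss: mean|F(G(a)) - a| + mean|G(F(b)) - b|.

    g_ab maps microphone-A features to microphone-B features;
    f_ba maps in the opposite direction. The two batches need not be
    paired or even the same size.
    """
    recon_a = f_ba(g_ab(batch_a))  # A -> B -> A
    recon_b = g_ab(f_ba(batch_b))  # B -> A -> B
    loss_a = np.mean(np.abs(recon_a - batch_a))
    loss_b = np.mean(np.abs(recon_b - batch_b))
    return float(loss_a + loss_b)


# Hypothetical stand-ins for learned generators: microphone B is
# modelled (purely for illustration) as microphone A with a gain of 2.
def to_mic_b(x):
    return 2.0 * x


def to_mic_a(x):
    return 0.5 * x


rng = np.random.default_rng(0)
batch_a = rng.normal(size=(4, 16))  # 4 unlabeled frames from microphone A
batch_b = rng.normal(size=(3, 16))  # 3 unrelated frames from microphone B

# Exact inverses reconstruct every frame, so the cycle loss vanishes.
perfect = cycle_consistency_loss(to_mic_b, to_mic_a, batch_a, batch_b)

# A mismatched reverse mapping leaves a reconstruction error.
imperfect = cycle_consistency_loss(to_mic_b, lambda x: 0.4 * x, batch_a, batch_b)

print(perfect, imperfect)
```

In a full CycleGAN this term is added to adversarial losses from two discriminators, and the mappings are neural networks trained by gradient descent; the sketch omits both to keep the cycle constraint itself visible.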


research
06/10/2023

Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks

Cycle-consistent generative adversarial networks have been widely used i...
research
03/20/2018

Speech-Driven Facial Reenactment Using Conditional Generative Adversarial Networks

We present a novel approach to generating photo-realistic images of a fa...
research
05/09/2023

Enhancing Gappy Speech Audio Signals with Generative Adversarial Networks

Gaps, dropouts and short clips of corrupted audio are a common problem a...
research
04/17/2021

Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement

The intelligibility of speech severely degrades in the presence of envir...
research
09/05/2021

Timbre Transfer with Variational Auto Encoding and Cycle-Consistent Adversarial Networks

This research project investigates the application of deep learning to t...
research
10/25/2019

Learning Domain Invariant Representations for Child-Adult Classification from Speech

Diagnostic procedures for ASD (autism spectrum disorder) involve semi-na...
research
05/03/2021

An End-to-End and Accurate PPG-based Respiratory Rate Estimation Approach Using Cycle Generative Adversarial Networks

Respiratory rate (RR) is a clinical sign representing ventilation. An ab...
