WEnets: A Convolutional Framework for Evaluating Audio Waveforms

09/19/2019
by Andrew A. Catellier, et al.

We describe a new convolutional framework for waveform evaluation, WEnets, and build a Narrowband Audio Waveform Evaluation Network, or NAWEnet, using this framework. NAWEnet is single-ended (or no-reference) and was trained three separate times in order to emulate PESQ, POLQA, or STOI, with testing correlations of 0.95, 0.92, and 0.95, respectively, when training on only 50% of available data and testing on 40%. Convolutional layers and non-linear downsampling learn which features are important for quality or intelligibility estimation. This straightforward architecture simplifies the interpretation of its inner workings and paves the way for future investigations into higher sample rates and accurate no-reference subjective speech quality predictions.
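
To make the reported testing correlations concrete, here is a minimal sketch of how a correlation between a network's estimates and the target metric's scores might be computed on a held-out test set. It assumes Pearson correlation, which the abstract does not specify. The function name `pearson_correlation` and the score values are hypothetical and are not data from the paper.

```python
import math


def pearson_correlation(predicted, target):
    """Pearson correlation between two equal-length sequences of scores."""
    if len(predicted) != len(target) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_p = sum(predicted) / len(predicted)
    mean_t = sum(target) / len(target)
    # Sum of co-deviations and sums of squared deviations.
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, target))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_t)


# Hypothetical values for illustration only; not data from the paper.
# target_scores stands in for scores from a full-reference metric,
# predicted_scores for a no-reference network's estimates of them.
target_scores = [1.2, 2.5, 3.1, 3.8, 4.4]
predicted_scores = [1.4, 2.3, 3.3, 3.6, 4.5]
r = pearson_correlation(predicted_scores, target_scores)
print(f"test correlation: {r:.2f}")
```

A value near 1 means the estimates track the target scores closely across the test files. It does not by itself show that individual estimates are accurate in absolute terms.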

research
06/27/2022

Wideband Audio Waveform Evaluation Networks: Efficient, Accurate Estimation of Speech Qualities

Wideband Audio Waveform Evaluation Networks (WAWEnets) are convolutional...
research
04/25/2021

Text-to-Speech Synthesis Techniques for MIDI-to-Audio Synthesis

Speech synthesis and music audio generation from symbolic input differ i...
research
11/10/2020

Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model

High-quality speech corpora are essential foundations for most speech ap...
research
11/15/2018

Comprehensive evaluation of statistical speech waveform synthesis

Statistical TTS systems that directly predict the speech waveform have r...
research
09/07/2020

Deep Learning-Based Single-Ended Objective Quality Measures for Time-Scale Modified Audio

Objective evaluation of audio processed with Time-Scale Modification (TS...
research
08/20/2018

Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks

We propose the multi-head convolutional neural network (MCNN) architectu...
research
10/11/2021

LaughNet: synthesizing laughter utterances from waveform silhouettes and a single laughter example

Emotional and controllable speech synthesis is a topic that has received...
