Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices

06/20/2016
by   Heiga Zen, et al.

Acoustic models based on long short-term memory recurrent neural networks (LSTM-RNNs) have been applied to statistical parametric speech synthesis (SPSS) and showed significant improvements in naturalness and latency over those based on hidden Markov models (HMMs). This paper describes further optimizations of LSTM-RNN-based SPSS for deployment on mobile devices: weight quantization, multi-frame inference, and robust inference using an ϵ-contaminated Gaussian loss function. Subjective listening tests show that these optimizations can make LSTM-RNN-based SPSS comparable to HMM-based SPSS in runtime speed while maintaining naturalness. Evaluations comparing LSTM-RNN-based SPSS with HMM-driven unit selection speech synthesis are also presented.
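The abstract names the optimizations without giving their form, so the following is only a minimal sketch of two of them under stated assumptions. The loss is written as the textbook ϵ-contaminated Gaussian: a mixture of a Gaussian and a wider Gaussian that shares its mean. The mixing weight `eps`, the variance multiplier `scale`, the 8-bit width, and all function names are illustrative choices, not values or APIs taken from the paper.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def contaminated_nll(x, mean, var, eps=0.1, scale=10.0):
    """Negative log-likelihood under an eps-contaminated Gaussian.

    Assumed form: (1 - eps) * N(x; mean, var) + eps * N(x; mean, scale * var).
    With eps = 0 this reduces to the ordinary Gaussian loss.
    """
    clean = (1.0 - eps) * gaussian_pdf(x, mean, var)
    wide = eps * gaussian_pdf(x, mean, scale * var)
    return -math.log(clean + wide)


def quantize_8bit(weights):
    """Linearly map floats to integer codes in 0..255 (assumed scheme)."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / 255.0
    if step == 0.0:
        step = 1.0  # all weights equal; any positive step works
    codes = [round((w - lo) / step) for w in weights]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Recover approximate float weights from the integer codes."""
    return [lo + c * step for c in codes]


# An outlier 10 standard deviations from the mean is penalized far less
# by the contaminated loss than by the plain Gaussian loss (eps = 0).
robust = contaminated_nll(10.0, 0.0, 1.0)
plain = contaminated_nll(10.0, 0.0, 1.0, eps=0.0)

weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, lo, step = quantize_8bit(weights)
restored = dequantize(codes, lo, step)
```

In this sketch the wide component keeps the loss from growing quadratically for outlying targets, which is the usual motivation for a contaminated loss, and the quantizer stores one byte per weight plus the two floats `lo` and `step`.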


