Shennong: a Python toolbox for audio speech features extraction

12/10/2021
by Mathieu Bernard, et al.

We introduce Shennong, a Python toolbox and command-line utility for speech feature extraction. It implements a wide range of well-established, state-of-the-art algorithms, including spectro-temporal filters such as Mel-Frequency Cepstral Filterbanks or Predictive Linear Filters, pre-trained neural networks, pitch estimators, speaker normalization methods, and post-processing algorithms. Shennong is an open-source, easy-to-use, reliable, and extensible framework. Because it is written in Python, it integrates easily with other speech modeling and machine learning tools. It aims to replace or complement several heterogeneous software packages, such as Kaldi or Praat. After describing the Shennong software architecture, its core components, and the implemented algorithms, this paper illustrates its use in three applications: a comparison of speech feature performance on a phone discrimination task, an analysis of a Vocal Tract Length Normalization model as a function of the duration of speech used for training, and a comparison of pitch estimation algorithms under various noise conditions.
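To give a feel for what comparing pitch estimators under noise involves, here is a minimal sketch in plain Python. It does not use Shennong or any of the algorithms the toolbox implements: the raw autocorrelation estimator, the function names `make_tone` and `estimate_pitch`, and all parameter values are assumptions chosen only for illustration.

```python
import math
import random


def make_tone(freq_hz, sample_rate, n_samples, snr_db=None, seed=0):
    """Synthesize a unit-amplitude sine, optionally with white Gaussian noise."""
    tone = [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]
    if snr_db is None:
        return tone
    rng = random.Random(seed)
    # A unit-amplitude sine has power 0.5; scale the noise to reach the SNR.
    noise_std = math.sqrt(0.5 / 10 ** (snr_db / 10))
    return [x + rng.gauss(0.0, noise_std) for x in tone]


def estimate_pitch(signal, sample_rate, fmin=50.0, fmax=400.0):
    """Return the frequency whose lag maximizes the raw autocorrelation.

    Illustrative only: this is a textbook estimator, not a Shennong method.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(signal[n] * signal[n + lag]
                    for n in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag
```

A comparison in this spirit would call `estimate_pitch(make_tone(200.0, 8000, 400, snr_db=s), 8000)` for a range of values of `s` and record the estimation error at each noise level, once per estimator under test.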


