Perceptive, non-linear Speech Processing and Spiking Neural Networks

by Jean Rouat, et al.

Source separation and speech recognition are very difficult when speech is noisy or corrupted. Most conventional techniques require very large databases to estimate the probability densities of speech (or noise) before they can perform separation or recognition. We discuss the potential of perceptive speech analysis and processing combined with biologically plausible neural network processors. We illustrate this non-linear processing of speech with a source separation system inspired by the Auditory Scene Analysis paradigm, and we also discuss a potential application to speech recognition.


On End-to-end Multi-channel Time Domain Speech Separation in Reverberant Environments

This paper introduces a new method for multi-channel time domain speech ...

Non-linear ICA based on Cramer-Wold metric

Non-linear source separation is a challenging open problem with many app...

Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks

Emulating the human ability to solve the cocktail party problem, i.e., f...

Independent Vector Analysis with Deep Neural Network Source Priors

This paper studies the density priors for independent vector analysis (I...

Breaking Speech Recognizers to Imagine Lyrics

We introduce a new method for generating text, and in particular song ly...

Biologically inspired speech emotion recognition

Conventional feature-based classification methods do not apply well to a...