End-to-End Learning of Speech 2D Feature-Trajectory for Prosthetic Hands

by Mohsen Jafarzadeh, et al.

Speech is one of the most common forms of human communication, and speech commands are an essential part of multimodal control of prosthetic hands. Over the past decades, researchers have used automatic speech recognition (ASR) systems to control prosthetic hands with speech commands. ASR systems learn to map human speech to text; natural language processing or a look-up table then maps the estimated text to a trajectory. However, the performance of conventional speech-controlled prosthetic hands remains unsatisfactory. Recent advancements in general-purpose graphics processing units (GPGPUs) enable intelligent devices to run deep neural networks in real time. As a result, architectures of intelligent systems have rapidly shifted from the paradigm of optimizing composite subsystems to the paradigm of end-to-end optimization. In this paper, we propose an end-to-end convolutional neural network (CNN) that maps speech 2D features directly to trajectories for prosthetic hands, omitting the intermediate speech-to-text step. The proposed CNN is lightweight, so it runs in real time on an embedded GPGPU. The method can use any type of speech 2D feature that has local correlations in each dimension, such as a spectrogram, MFCC, or PNCC. The network is written in Python using the Keras library with a TensorFlow backend, and we optimized the CNN for the NVIDIA Jetson TX2 developer kit. Our experiments with this CNN demonstrate a root-mean-square error of 0.119 and a 20 ms running time to produce trajectory outputs from the voice input data. To achieve a lower error in real time, a similar CNN can be optimized for a more powerful embedded GPGPU such as the NVIDIA AGX Xavier.
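To make the end-to-end idea concrete, the sketch below builds a small Keras CNN that maps a speech 2D feature map directly to a flattened trajectory vector, trained with a mean-squared-error loss (RMSE is then the square root of that loss). This is only an illustrative sketch, not the authors' actual architecture: the input shape of 100 frames by 40 MFCC coefficients, the 5-joint by 50-waypoint trajectory output, and all layer sizes are assumptions chosen for clarity.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Assumed shapes (illustrative, not from the paper):
# 100 time frames x 40 MFCC coefficients, single channel.
FEATURE_SHAPE = (100, 40, 1)
# Trajectory flattened to 5 joints x 50 waypoints = 250 values.
TRAJECTORY_DIM = 5 * 50

def build_model() -> keras.Model:
    """A lightweight CNN mapping a speech 2D feature map to a trajectory."""
    inputs = keras.Input(shape=FEATURE_SHAPE)
    # Convolutions exploit the local correlations that spectrogram-like
    # features have along both the time and frequency axes.
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(inputs)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation="relu")(x)
    # Linear output head: a regression onto the flattened trajectory.
    outputs = layers.Dense(TRAJECTORY_DIM)(x)
    return keras.Model(inputs, outputs)

model = build_model()
model.compile(optimizer="adam", loss="mse")  # report sqrt(MSE) as RMSE
```

Because the output is a plain regression head, the same skeleton works for any 2D feature (spectrogram, MFCC, PNCC) by changing only `FEATURE_SHAPE`; for deployment on a Jetson-class GPGPU the trained model would typically be exported and optimized further (e.g., with TensorRT).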




