A.I. based Embedded Speech to Text Using Deepspeech

DeepSpeech is useful for developing IoT devices that need voice recognition. DeepSpeech, from Mozilla, is an open-source speech recognition system that uses a neural network to convert a speech spectrogram into a text transcript. This paper describes the process of implementing speech recognition on a low-end computational device. English-language speech recognition, for which many datasets are available, is a good starting point. The models used are the pre-trained models provided with each DeepSpeech release, without any modification. Using a Raspberry Pi as an end-to-end speech recognition device has further benefits: the user can change and modify the speech recognition system, and DeepSpeech can run as a standalone device that does not need a continuous internet connection to process speech. This paper also shows that TensorFlow Lite makes a significant difference to DeepSpeech inference compared with standard (non-Lite) TensorFlow. The experiments use DeepSpeech versions 0.1.0, 0.1.1, and 0.6.0; version 0.6.0 shows some improvement, processing speech-to-text faster on older hardware, the Raspberry Pi 3 B+.
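The speed comparison between versions and between TensorFlow Lite and standard TensorFlow can be expressed as a real-time factor: inference time divided by audio duration, where a value below 1 means faster than real time. The following is a minimal sketch of such a measurement, not the paper's benchmark procedure. The names `real_time_factor` and `fake_transcribe` are hypothetical, and the audio is assumed to be 16 kHz, mono, 16-bit PCM. In practice `fake_transcribe` would be replaced by a call into a loaded speech-to-text model.

```python
import time
from typing import Callable, Tuple


def real_time_factor(
    transcribe: Callable[[bytes], str],
    pcm: bytes,
    sample_rate: int = 16000,  # assumption: 16 kHz audio
    sample_width: int = 2,     # assumption: 16-bit mono samples (2 bytes each)
) -> Tuple[str, float]:
    """Run `transcribe` once and return (text, inference_time / audio_duration)."""
    audio_seconds = len(pcm) / (sample_rate * sample_width)
    if audio_seconds <= 0:
        raise ValueError("audio buffer is empty")
    start = time.perf_counter()
    text = transcribe(pcm)
    elapsed = time.perf_counter() - start
    return text, elapsed / audio_seconds


def fake_transcribe(pcm: bytes) -> str:
    """Hypothetical stand-in for a real speech-to-text engine."""
    return "hello world"


if __name__ == "__main__":
    one_second_of_silence = b"\x00\x00" * 16000
    text, rtf = real_time_factor(fake_transcribe, one_second_of_silence)
    print(f"transcript={text!r} real-time factor={rtf:.4f}")
```

Running the same audio clip through each model build and comparing the resulting factors gives a like-for-like view of inference speed on a given device.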

research
12/14/2020

A review of on-device fully neural end-to-end automatic speech recognition algorithms

In this paper, we review various end-to-end automatic speech recognition...
research
12/22/2021

VoiceMoji: A Novel On-Device Pipeline for Seamless Emoji Insertion in Dictation

Most of the speech recognition systems recover only words in the speech ...
research
06/07/2023

Zambezi Voice: A Multilingual Speech Corpus for Zambian Languages

This work introduces Zambezi Voice, an open-source multilingual speech r...
research
10/03/2019

Convolutional Neural Networks for Speech Controlled Prosthetic Hands

Speech recognition is one of the key topics in artificial intelligence, ...
research
05/04/2022

Design of a novel Korean learning application for efficient pronunciation correction

The Korean wave, which denotes the global popularity of South Korea's cu...
research
09/22/2020

End-to-End Learning of Speech 2D Feature-Trajectory for Prosthetic Hands

Speech is one of the most common forms of communication in humans. Speec...
research
07/23/2022

Implementation Of Tiny Machine Learning Models On Arduino 33 BLE For Gesture And Speech Recognition

In this article gesture recognition and speech recognition applications ...
