DualVoice: Speech Interaction that Discriminates between Normal and Whispered Voice Input

08/22/2022
by Jun Rekimoto, et al.

Interactions based on automatic speech recognition (ASR) have become widely used, and speech input is increasingly used to create documents. However, because there is no easy way to distinguish spoken commands from spoken text to be entered, misrecognitions are difficult to identify and correct, and documents must be edited and corrected manually. Entering symbols and commands is also challenging because they may be misrecognized as ordinary text. To address these problems, this study proposes a speech interaction method called DualVoice, in which commands are input in a whispered voice and text in a normal voice. The proposed method requires no specialized hardware other than a regular microphone, enabling completely hands-free interaction. The method can be used in a wide range of situations where speech recognition is already available, from text input to mobile and wearable computing. Two neural networks were designed in this study: one to discriminate normal speech from whispered speech, and one to recognize whispered speech. A prototype text input system was then developed to show how normal and whispered voice can be used together in speech text input. Other potential applications of DualVoice are also discussed.
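
To make the normal/whisper split concrete, the following is a minimal Python sketch of how already-classified speech segments might be routed: normal-voice segments are appended as text, and whispered segments are interpreted as commands. It is an illustration under stated assumptions, not the authors' implementation. The `Segment` and `TextBuffer` types and the command names ("delete word", "comma", "new line") are hypothetical, and the two neural networks described in the abstract are assumed to have already produced the `voice` label and the `transcript`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """One recognized utterance. Both fields are assumed to come from
    upstream models (a normal/whisper discriminator and a recognizer)."""
    voice: str        # "normal" or "whisper"
    transcript: str   # recognized words for this segment


@dataclass
class TextBuffer:
    """Hypothetical document buffer that receives text and commands."""
    words: List[str] = field(default_factory=list)

    def apply(self, seg: Segment) -> None:
        # Normal voice is treated as text; whispered voice as a command.
        if seg.voice == "normal":
            self.words.extend(seg.transcript.split())
        elif seg.voice == "whisper":
            self.run_command(seg.transcript.strip().lower())
        else:
            raise ValueError(f"unknown voice label: {seg.voice!r}")

    def run_command(self, name: str) -> None:
        # Illustrative command set only; unknown commands are ignored.
        if name == "delete word":
            if self.words:
                self.words.pop()
        elif name == "comma":
            if self.words:
                self.words[-1] += ","
            else:
                self.words.append(",")
        elif name == "new line":
            self.words.append("\n")

    def text(self) -> str:
        return " ".join(self.words)


def run_session(segments: List[Segment]) -> str:
    """Feed a sequence of labeled segments through a fresh buffer."""
    buf = TextBuffer()
    for seg in segments:
        buf.apply(seg)
    return buf.text()
```

For example, saying "hello world there" in a normal voice, whispering "delete word" and then "comma", and saying "again" in a normal voice would leave the buffer holding "hello world, again". Because the routing depends only on the voice label, a whispered "comma" is never confused with the word "comma" spoken as text.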


