CITISEN: A Deep Learning-Based Speech Signal-Processing Mobile Application

08/21/2020
by Alexander Chao-Fu Kang, et al.

In this paper, we present CITISEN, a deep learning-based speech signal-processing mobile application that performs three functions: speech enhancement (SE), acoustic scene conversion (ASC), and model adaptation (MA). For SE, CITISEN suppresses noise components in speech signals, thereby improving their clarity and intelligibility. For ASC, CITISEN converts the current background sound to a different background sound. For MA, CITISEN uses a few audio files to adapt an SE model when it encounters unknown speakers or noise types; the adapted SE model is then used to enhance subsequent noisy utterances. Objective evaluations and subjective listening tests confirmed the effectiveness of CITISEN in performing these three functions. These promising results suggest that CITISEN can potentially serve as a front-end processor for various speech-related services such as voice communication, assistive hearing devices, and virtual reality headsets.
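The abstract does not say how ASC is implemented. One plausible reading, offered here only as an assumption, is that the original background is first suppressed by SE and a new background is then mixed back in. The sketch below illustrates only that final mixing step at a chosen signal-to-noise ratio. The function names `power` and `mix_at_snr` are hypothetical and are not taken from CITISEN.

```python
import math


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, background, snr_db):
    """Add `background` to `speech`, scaled so that the
    speech-to-background power ratio equals `snr_db` decibels.

    Assumption: `speech` stands in for an already enhanced signal;
    this is an illustration, not the method used in the paper.
    """
    if len(speech) != len(background):
        raise ValueError("speech and background must have the same length")
    p_speech = power(speech)
    p_background = power(background)
    if p_background == 0:
        raise ValueError("background is silent; nothing to mix")
    # Background power needed to reach the requested SNR.
    target_p_background = p_speech / (10 ** (snr_db / 10))
    gain = math.sqrt(target_p_background / p_background)
    return [s + gain * b for s, b in zip(speech, background)]


# Toy example: two sinusoids stand in for speech and a new background.
n = 1600
speech = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
background = [0.3 * math.sin(2 * math.pi * 37 * i / n) for i in range(n)]
mixed = mix_at_snr(speech, background, snr_db=10.0)
```

In this toy setup, subtracting `speech` from `mixed` recovers the scaled background, so the achieved SNR can be checked directly against the requested value.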


