End-to-end translation of human neural activity to speech with a dual-dual generative adversarial network

10/13/2021
by Yina Guo, et al.

A recent study of auditory evoked potential (AEP) based brain-computer interfaces (BCI) showed that an encoder-decoder framework can translate human neural activity to speech (T-CAS). However, current encoder-decoder methods usually achieve T-CAS in two steps, passing information between the encoder and decoder through a shared dimension-reduction vector, which may lose information. A potential remedy is an end-to-end method based on a dual generative adversarial network (DualGAN), which passes information without dimension reduction, but DualGAN cannot realize one-to-one signal-to-signal translation (see Fig. 1 (a) and (b)). In this paper, we propose an end-to-end model that translates human neural activity to speech directly. We create a new electroencephalogram (EEG) dataset from participants with good attention, using a device we designed to detect participants' attention, and we introduce a dual-dual generative adversarial network (Dual-DualGAN) (see Fig. 1 (c) and (d)) to address the end-to-end translation of human neural activity to speech (ET-CAS) problem by group-labelling EEG signals and speech signals and inserting a transition domain to realize cross-domain mapping. In the transition domain, each transition signal is formed by cascading the corresponding EEG and speech signals in a certain proportion, which builds a bridge between EEG and speech signals that have no corresponding features and realizes one-to-one cross-domain EEG-to-speech translation. The proposed method can translate word-length and sentence-length sequences of neural activity to speech. Experimental evaluation shows that the proposed method significantly outperforms state-of-the-art methods on both word and sentence auditory stimuli.
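
The transition-domain idea can be illustrated with a minimal sketch. The abstract does not give the exact construction, so everything below is an assumption made for illustration: the helper name `make_transition_signal`, the parameter `eeg_ratio`, the requirement that paired signals have equal length, and the choice to take leading samples from EEG and trailing samples from speech are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def make_transition_signal(
    eeg: Sequence[float],
    speech: Sequence[float],
    eeg_ratio: float,
) -> List[float]:
    """Concatenate a leading EEG segment with a trailing speech segment.

    Hypothetical helper (not from the paper): `eeg_ratio` is the fraction
    of the output taken from the EEG signal; the remainder comes from the
    paired speech signal.
    """
    if len(eeg) != len(speech):
        raise ValueError("paired signals must have the same length")
    if not 0.0 <= eeg_ratio <= 1.0:
        raise ValueError("eeg_ratio must lie in [0, 1]")
    # Number of leading samples drawn from the EEG signal.
    n_eeg = round(eeg_ratio * len(eeg))
    # Leading samples come from EEG, trailing samples from speech,
    # so the output has the same length as each input.
    return list(eeg[:n_eeg]) + list(speech[n_eeg:])


if __name__ == "__main__":
    eeg = [1.0, 2.0, 3.0, 4.0]
    speech = [10.0, 20.0, 30.0, 40.0]
    print(make_transition_signal(eeg, speech, 0.5))
```

Under these assumptions, a ratio of 1.0 reproduces the EEG signal and a ratio of 0.0 reproduces the speech signal, so intermediate ratios give signals that share samples with both domains.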


research
10/28/2020

Bridging the Modality Gap for Speech-to-Text Translation

End-to-end speech translation aims to translate speech in one language i...
research
04/07/2020

Direct Speech-to-image Translation

Direct speech-to-image translation without text is an interesting and us...
research
01/02/2023

Towards Voice Reconstruction from EEG during Imagined Speech

Translating imagined speech from human brain activity into voice is a ch...
research
10/25/2020

Orthros: Non-autoregressive End-to-end Speech Translation with Dual-decoder

Fast inference speed is an important goal towards real-world deployment ...
research
10/13/2021

One to Multiple Mapping Dual Learning: Learning Multiple Sources from One Mixed Signal

Single channel blind source separation (SCBSS) refers to separate multip...
research
09/03/2020

ThoughtViz: Visualizing Human Thoughts Using Generative Adversarial Network

Studying human brain signals has always gathered great attention from th...
research
03/29/2021

Product semantics translation from brain activity via adversarial learning

A small change of design semantics may affect a user's satisfaction with...
