UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity

08/14/2023
by   Weijian Mai, et al.

Image reconstruction and captioning from brain activity evoked by visual stimuli allow researchers to further understand the connection between the human brain and the visual perception system. While deep generative models have recently been employed in this field, reconstructing realistic captions and images with both low-level details and high semantic fidelity is still a challenging problem. In this work, we propose UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity. For the first time, we unify image reconstruction and captioning from visual-evoked functional magnetic resonance imaging (fMRI) through a latent diffusion model termed Versatile Diffusion. Specifically, we transform fMRI voxels into text and image latents for low-level information and guide the reverse diffusion process through fMRI-based image and text conditions derived from CLIP to generate realistic captions and images. UniBrain outperforms current methods both qualitatively and quantitatively in terms of image reconstruction and reports image captioning results for the first time on the Natural Scenes Dataset (NSD). Moreover, the ablation experiments and functional region-of-interest (ROI) analysis further demonstrate the superiority of UniBrain and provide comprehensive insight into visual-evoked brain decoding.
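
The abstract does not say how fMRI voxels are mapped to latents. As a rough illustration only, the sketch below assumes a linear ridge regression from voxel responses to a latent vector, which is a common baseline in brain decoding and not necessarily the authors' method. All names, shapes, and the synthetic data are hypothetical.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)

def predict(X, W):
    """Map voxel responses to predicted latents with the learned weights."""
    return X @ W

# Synthetic stand-ins (assumption): 200 trials, 50 voxels, 8-dim latent.
rng = np.random.default_rng(0)
n_trials, n_voxels, latent_dim = 200, 50, 8
true_W = rng.normal(size=(n_voxels, latent_dim))
voxels = rng.normal(size=(n_trials, n_voxels))
latents = voxels @ true_W + 0.01 * rng.normal(size=(n_trials, latent_dim))

W = fit_ridge(voxels, latents, alpha=1e-3)
pred = predict(voxels, W)
print("mean squared error:", float(np.mean((pred - latents) ** 2)))
```

In a real pipeline the targets would be latents or CLIP embeddings extracted from the stimulus images and captions, and the predicted vectors would then condition the diffusion model. That part is omitted here.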

research
03/24/2023

MindDiffuser: Controlled Image Reconstruction from Human Brain Activity with Semantic and Structural Diffusion

Reconstructing visual stimuli from measured functional magnetic resonanc...
research
06/16/2023

DreamCatcher: Revealing the Language of the Brain with fMRI using GPT Embedding

The human brain possesses remarkable abilities in visual processing, inc...
research
03/09/2023

Brain-Diffuser: Natural scene reconstruction from fMRI signals using generative latent diffusion

In neural decoding research, one of the most intriguing topics is the re...
research
04/25/2017

Sharing deep generative representation for perceived image reconstruction from human brain activity

Decoding human brain activities via functional magnetic resonance imagin...
research
07/27/2023

Seeing through the Brain: Image Reconstruction of Visual Perception from Human Brain Signals

Seeing is believing, however, the underlying mechanism of how human visu...
research
05/17/2023

Controllable Mind Visual Diffusion Model

Brain signal visualization has emerged as an active research area, servi...