Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding

11/13/2022
by   Zijiao Chen, et al.
0

Decoding visual stimuli from brain recordings aims to deepen our understanding of the human visual system and build a solid foundation for bridging human and computer vision through the Brain-Computer Interface. However, reconstructing high-quality images with correct semantics from brain recordings is a challenging problem due to the complex underlying representations of brain signals and the scarcity of data annotations. In this work, we present MinD-Vis: Sparse Masked Brain Modeling with Double-Conditioned Latent Diffusion Model for Human Vision Decoding. Firstly, we learn an effective self-supervised representation of fMRI data using mask modeling in a large latent space inspired by the sparse coding of information in the primary visual cortex. Then by augmenting a latent diffusion model with double-conditioning, we show that MinD-Vis can reconstruct highly plausible images with semantically matching details from brain recordings using very few paired annotations. We benchmarked our model qualitatively and quantitatively; the experimental results indicate that our method outperformed state-of-the-art in both semantic mapping (100-way semantic classification) and generation quality (FID) by 66 also conducted to analyze our framework.

READ FULL TEXT

page 1

page 5

page 6

page 8

page 12

page 13

page 14

page 15

research
05/17/2023

Controllable Mind Visual Diffusion Model

Brain signal visualization has emerged as an active research area, servi...
research
05/19/2023

Cinematic Mindscapes: High-quality Video Reconstruction from Brain Activity

Reconstructing human vision from brain activities has been an appealing ...
research
04/30/2022

Directly wireless communication of human minds via non-invasive brain-computer-metasurface platform

Brain-computer interfaces (BCIs), invasive or non-invasive, have project...
research
03/26/2023

Semantic Neural Decoding via Cross-Modal Generation

Semantic neural decoding aims to elucidate the cognitive processes of th...
research
09/03/2019

Brain2Char: A Deep Architecture for Decoding Text from Brain Recordings

Decoding language representations directly from the brain can enable new...
research
08/20/2022

Net2Brain: A Toolbox to compare artificial vision models with human brain responses

We introduce Net2Brain, a graphical and command-line user interface tool...

Please sign up or login with your details

Forgot password? Click here to reset