Natural scene reconstruction from fMRI signals using generative latent diffusion

Published:2023-09-20 Issue:1 Volume:13
ISSN:2045-2322
Container-title:Scientific Reports
language:en
Short-container-title:Sci Rep

Author:

Ozcelik Furkan, VanRullen Rufin

Abstract

In neural decoding research, one of the most intriguing topics is the reconstruction of perceived natural images based on fMRI signals. Previous studies have succeeded in re-creating different aspects of the visuals, such as low-level properties (shape, texture, layout) or high-level features (category of objects, descriptive semantics of scenes) but have typically failed to reconstruct these properties together for complex scene images. Generative AI has recently made a leap forward with latent diffusion models capable of generating high-complexity images. Here, we investigate how to take advantage of this innovative technology for brain decoding. We present a two-stage scene reconstruction framework called “Brain-Diffuser”. In the first stage, starting from fMRI signals, we reconstruct images that capture low-level properties and overall layout using a VDVAE (Very Deep Variational Autoencoder) model. In the second stage, we use the image-to-image framework of a latent diffusion model (Versatile Diffusion) conditioned on predicted multimodal (text and visual) features, to generate final reconstructed images. On the publicly available Natural Scenes Dataset benchmark, our method outperforms previous models both qualitatively and quantitatively. When applied to synthetic fMRI patterns generated from individual ROI (region-of-interest) masks, our trained model creates compelling “ROI-optimal” scenes consistent with neuroscientific knowledge. Thus, the proposed methodology can have an impact on both applied (e.g. brain–computer interface) and fundamental neuroscience.

Funder

Agence Nationale de la Recherche

Publisher

Springer Science and Business Media LLC

Subject

Multidisciplinary

Link

https://www.nature.com/articles/s41598-023-42891-8.pdf

References: 51 articles.

1. Thirion, B. et al. Inverse retinotopy: Inferring the visual content of images from brain activation patterns. Neuroimage 33, 1104–1116 (2006).

2. Kamitani, Y. & Tong, F. Decoding the visual and subjective contents of the human brain. Nat. Neurosci. 8, 679–685 (2005).

3. Haynes, J.-D. & Rees, G. Predicting the orientation of invisible stimuli from activity in human primary visual cortex. Nat. Neurosci. 8, 686–691 (2005).

4. Haxby, J. V. et al. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293, 2425–2430 (2001).

5. Cox, D. D. & Savoy, R. L. Functional magnetic resonance imaging (fMRI) “brain reading”: Detecting and classifying distributed patterns of fMRI activity in human visual cortex. Neuroimage 19, 261–270 (2003).

Cited by 22 articles.

1. Decoding dynamic visual scenes across the brain hierarchy;PLOS Computational Biology;2024-08-02

2. Decoding dynamic visual scenes across the brain hierarchy;2024-06-28

3. Movie reconstruction from mouse visual cortex activity;2024-06-21

4. MindLDM: Reconstruct Visual Stimuli from fMRI Using Latent Diffusion Model;2024 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA);2024-06-14

5. Large-scale foundation models and generative AI for BigData neuroscience;Neuroscience Research;2024-06