
Mind Video

Mind-Video decodes brain signals to generate high-quality, semantically accurate videos from visual fMRI data.

What is Mind Video?

"Mind-Video" is a two-module pipeline designed to bridge the gap between image and video brain decoding. It progressively learns from brain signals to gain a deeper understanding of the semantic space through multiple stages. The model leverages unsupervised learning with masked brain modeling to learn visual fMRI features and distills semantic-related features using contrastive learning in the CLIP space. The learned features are fine-tuned through co-training with a stable diffusion model tailored for video generation under fMRI guidance. The results show high-quality video reconstructions with accurate semantics, outperforming previous approaches in both semantic and pixel metrics.

Who created Mind Video?

Mind Video was created by Zijiao Chen, Jiaxin Qing, Tiange Xiang, and Prof. Juan Helen Zhou, and was launched on June 18, 2024. It is developed by Mind-X, a research interest group focused on multimodal brain decoding with large models. The group aims to advance brain decoding using recent advances in large models and AGI, with the goal of developing general-purpose brain decoding models for applications in brain-computer interfaces, neuroimaging, and neuroscience.

What is Mind Video used for?

  • Reconstructing human vision from brain activity to better understand cognitive processes
  • Recovering continuous visual experiences in the form of videos, bridging the gap between image and video brain decoding
  • Learning general visual fMRI features through large-scale unsupervised learning with masked brain modeling
  • Distilling semantic-related features from annotated datasets via multimodal contrastive learning
  • Progressively learning semantics from brain signals across multiple stages
  • Preserving volume and time-frame information during encoding
  • Training the fMRI encoder and stable diffusion model separately, then fine-tuning them together to enhance generation consistency
  • Generating accurate, high-quality video reconstructions, outperforming previous approaches by 45%
  • Analyzing encoder attention, which demonstrates visual cortex dominance and hierarchical operation across encoder layers
  • Developing general-purpose brain decoding models to empower applications in brain-computer interfaces, neuroimaging, and neuroscience

Who is Mind Video for?

  • Neuroscientists
  • Brain-computer interface researchers and specialists
  • Neuroimaging scientists and researchers
  • Neurologists
  • Psychiatrists

How to use Mind Video?

To use Mind-Video, follow these steps:

  1. Understanding the Purpose: Mind-Video aims to bridge the gap between image and video brain decoding through a purpose-built two-module pipeline.

  2. Model Training: The Mind-Video model consists of two modules that are trained separately and then fine-tuned together. The first module uses unsupervised learning with masked brain modeling to learn general visual fMRI features and distills semantic-related features from annotated datasets (a sketch of the masking objective follows these steps). The second module fine-tunes the learned features by co-training with an augmented stable diffusion model tailored for video generation under fMRI guidance.

  3. Progressive Learning: The model learns from brain signals progressively, gaining a deeper understanding of the semantic space through multiple stages in the first module, including multimodal contrastive learning in the CLIP space with spatiotemporal attention for windowed fMRI (a contrastive-loss sketch also follows these steps).

  4. Results Evaluation: Mind-Video generates high-quality videos with accurate semantics, showing significant improvements over previous approaches in both semantic metrics and SSIM (a minimal SSIM evaluation example appears after these steps).

  5. Contribution to the Field: Mind-Video introduces a flexible and adaptable brain decoding pipeline, with an emphasis on learning accuracy, semantic relevance, and model interpretability.

By following these steps, users can effectively apply Mind-Video to video brain decoding applications.
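
For readers who want a concrete picture of step 2, here is a hedged sketch of the masked-brain-modeling objective used in the unsupervised stage, written in the spirit of a masked autoencoder. The function name, mask ratio, and the encoder/decoder callables are assumptions for illustration, not Mind-Video's actual code.

import torch
import torch.nn.functional as F

def masked_brain_modeling_loss(encoder, decoder, fmri_patches, mask_ratio=0.75):
    """fmri_patches: (batch, n_patches, patch_dim) of voxel activations.

    encoder/decoder are placeholder callables: the encoder sees only the
    visible patches; the decoder predicts all patches (MAE-style).
    """
    b, n, d = fmri_patches.shape
    n_keep = int(n * (1 - mask_ratio))
    perm = torch.rand(b, n).argsort(dim=1)      # random patch order per sample
    keep = perm[:, :n_keep]                     # indices of visible patches
    visible = torch.gather(fmri_patches, 1,
                           keep.unsqueeze(-1).expand(-1, -1, d))
    recon = decoder(encoder(visible))           # (batch, n_patches, patch_dim)
    hidden = torch.ones(b, n, dtype=torch.bool) # True where a patch was masked
    hidden.scatter_(1, keep, False)
    # score the reconstruction on the masked patches only
    return F.mse_loss(recon[hidden], fmri_patches[hidden])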
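
The multimodal contrastive step from step 3 can likewise be sketched as a symmetric InfoNCE loss that aligns windowed-fMRI embeddings with paired CLIP embeddings of the stimulus; the function name and temperature value here are assumptions.

import torch
import torch.nn.functional as F

def clip_contrastive_loss(fmri_emb, clip_emb, temperature=0.07):
    """fmri_emb, clip_emb: (batch, dim); row i of each forms a matched pair."""
    fmri_emb = F.normalize(fmri_emb, dim=-1)
    clip_emb = F.normalize(clip_emb, dim=-1)
    logits = fmri_emb @ clip_emb.t() / temperature  # (batch, batch) similarities
    targets = torch.arange(logits.size(0))          # diagonal entries are positives
    # symmetric loss over both matching directions, as in CLIP training
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2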
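
Finally, the pixel-level part of the evaluation in step 4 can be reproduced with off-the-shelf SSIM, for example via scikit-image; the arrays below are dummies standing in for real frames.

import numpy as np
from skimage.metrics import structural_similarity

truth = np.random.rand(256, 256, 3)   # ground-truth stimulus frame (dummy data)
recon = np.random.rand(256, 256, 3)   # model-reconstructed frame (dummy data)
score = structural_similarity(truth, recon, channel_axis=-1, data_range=1.0)
print(f"SSIM: {score:.3f}")           # 1.0 would mean identical frames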

Pros
  • Flexible and adaptable brain decoding pipeline
  • Two-module design that co-trains the fMRI encoder and an augmented stable diffusion model
  • Progressive learning scheme that deepens understanding of the semantic space through multiple stages
  • Attention analysis mapping to the visual cortex and higher cognitive networks
  • Generation of high-quality videos with accurate semantics
  • Biologically plausible and interpretable model
  • Outperformed previous state-of-the-art approaches by 45%
  • Achieved 85% accuracy in semantic metrics
Cons
  • Requires large-scale fMRI data and is dependent on data quality
  • Complex two-module pipeline with extensive training periods
  • Relies on annotated datasets and fine-tuning processes
  • Transformer hierarchy can complicate processing
  • Semantic learning is gradual
  • Tied to a specific diffusion model
  • Focus on the visual cortex is not universally applicable

Mind Video FAQs

What is Mind-Video?
Mind-Video is a two-module pipeline designed to bridge the gap between image and video brain decoding, with separate training and fine-tuning of the modules.
How does Mind-Video learn from brain signals?
The model learns from brain signals progressively through multiple stages in the first module, leveraging unsupervised learning and spatiotemporal attention.
What is the contribution of Mind-Video?
Mind-Video introduces a flexible brain decoding pipeline with separate fMRI-encoder and stable-diffusion modules, producing high-quality videos that score well on both semantic and pixel metrics.
What insights were gained from the attention analysis in brain decoding?
The attention analysis of the transformers decoding fMRI data yielded three significant insights, including the dominance of the visual cortex in visual decoding and the hierarchical operation of the encoder layers.
What motivated the development of Mind-Video?
Mind-Video was developed to address the gap between video reconstruction and earlier image reconstruction work, with a focus on enhancing generation consistency and preserving dynamics within each fMRI frame.
Who are acknowledged in the development of Mind-Video?
Acknowledgments include Tiange Xiang, Jonathan Xu, members of the Multimodal Neuroimaging in Neuropsychiatric Disorders Laboratory, the Human Connectome Project (HCP), Prof. Zhongming Liu, Dr. Haiguang Wen, the Stable Diffusion team, and the Tune-A-Video team.
