Efficient Vision Mamba for MRI Super-Resolution via Hybrid Selective Scanning


Dec 22, 2025, 9:14
Computer Vision and Pattern Recognition · physics.med-ph

Abstract

Background: High-resolution MRI is critical for diagnosis, but long acquisition times limit clinical use. Super-resolution (SR) can enhance resolution post-scan, yet existing deep learning methods face fidelity-efficiency trade-offs.

Purpose: To develop a computationally efficient and accurate deep learning framework for MRI SR that preserves anatomical detail for clinical integration.

Materials and Methods: We propose a novel SR framework combining multi-head selective state-space models (MHSSM) with a lightweight channel MLP. The model uses 2D patch extraction with hybrid scanning to capture long-range dependencies. Each MambaFormer block integrates MHSSM, depthwise convolutions, and gated channel mixing. Evaluation used 7T brain T1 MP2RAGE maps (n=142) and 1.5T prostate T2w MRI (n=334). Comparisons included bicubic interpolation, GANs (CycleGAN, Pix2pix, SPSR), transformers (SwinIR), Mamba (MambaIR), and diffusion models (I2SB, Res-SRDiff).

Results: Our model achieved superior performance with exceptional efficiency. For 7T brain data: SSIM=0.951±0.021, PSNR=26.90±1.41 dB, LPIPS=0.076±0.022, GMSD=0.083±0.017, significantly outperforming all baselines (p<0.001). For prostate data: SSIM=0.770±0.049, PSNR=27.15±2.19 dB, LPIPS=0.190±0.095, GMSD=0.087±0.013. The framework used only 0.9M parameters and 57 GFLOPs, reducing parameters by 99.8% and computation by 97.5% versus Res-SRDiff, while outperforming SwinIR and MambaIR in accuracy and efficiency.

Conclusion: The proposed framework provides an efficient, accurate MRI SR solution, delivering enhanced anatomical detail across datasets. Its low computational demand and state-of-the-art performance show strong potential for clinical translation.

Summary

This paper addresses the challenge of long acquisition times in MRI, which limits clinical use despite the modality's excellent soft-tissue contrast. The authors propose an efficient deep learning framework for MRI super-resolution (SR) using a novel Vision Mamba architecture. This architecture combines multi-head selective state-space models (MHSSM) with a lightweight channel MLP and employs a hybrid scanning strategy (vertical, horizontal, diagonal) to capture long-range dependencies. The key idea is to achieve high reconstruction fidelity while minimizing computational overhead, making the SR solution practical for clinical integration. The proposed framework was evaluated on 7T brain T1 MP2RAGE maps and 1.5T prostate T2w MRI, and compared against several baselines, including GANs, transformers, Mamba, and diffusion models.

The results demonstrate that the proposed model outperforms the baselines in both accuracy and efficiency. Specifically, it achieved superior SSIM, PSNR, LPIPS, and GMSD scores on both datasets. Crucially, the model achieves these results with only 0.9 million parameters and 57 GFLOPs, representing a significant reduction in parameters (99.8%) and computation (97.5%) compared to Res-SRDiff. The authors conclude that their framework provides a computationally efficient yet accurate MRI SR solution with enhanced anatomical detail, showing strong potential for clinical translation due to its low computational demand and state-of-the-art performance. The hybrid scanning strategy and lightweight channel MLP are key contributions.
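To make the hybrid scanning idea concrete, the sketch below enumerates the three traversal orders (horizontal, vertical, diagonal) over a small patch grid. This is an illustrative reconstruction of the general concept, not the paper's implementation: the function names, the 4x4 grid size, and the anti-diagonal grouping are our own assumptions.

```python
import numpy as np

def horizontal_scan(h, w):
    """Row-major (raster) order over an h x w patch grid."""
    idx = np.arange(h * w).reshape(h, w)
    return idx.flatten()

def vertical_scan(h, w):
    """Column-major order: traverse each column top to bottom."""
    idx = np.arange(h * w).reshape(h, w)
    return idx.T.flatten()

def diagonal_scan(h, w):
    """Anti-diagonal order: patches grouped by i + j (illustrative choice)."""
    idx = np.arange(h * w).reshape(h, w)
    order = []
    for s in range(h + w - 1):          # each anti-diagonal has constant i + j = s
        for i in range(h):
            j = s - i
            if 0 <= j < w:
                order.append(idx[i, j])
    return np.array(order)

h = w = 4
print(horizontal_scan(h, w))  # [ 0  1  2 ... 15]
print(vertical_scan(h, w))    # [ 0  4  8 12  1  5 ... ]
print(diagonal_scan(h, w))    # [ 0  1  4  2  5  8 ... ]
```

Running the selective state-space model along several such orderings means each patch appears early in at least one sequence, which is how multi-directional scanning mitigates the pixel-forgetting problem of a single raster scan.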

Key Insights

  • The proposed Vision Mamba framework achieves state-of-the-art MRI super-resolution performance with significantly reduced computational complexity compared to transformer-based and diffusion-based methods.
  • Hybrid selective scanning (vertical, horizontal, diagonal) mitigates pixel forgetting issues inherent in horizontal/vertical-only scanning strategies, improving long-range dependency modeling.
  • The lightweight channel MLP effectively reduces parameter overhead without sacrificing representational power, contributing to the overall efficiency of the model. The MLP expands the channel dimension by a factor of 2.
  • On the 7T brain dataset, the model achieved a 2.1% SSIM gain over SPSR (0.951 vs 0.932) and a 2.4% PSNR improvement over Res-SRDiff (26.90 dB vs 26.28 dB).
  • Compared to Res-SRDiff, the proposed method reduces the number of parameters by 99.8% (0.9M vs 394M) and computational operations by 97.5% (57 GFLOPs vs 2316 GFLOPs).
  • The model is evaluated on two distinct datasets (7T brain T1 maps and 1.5T prostate T2w images), demonstrating its robustness and generalizability across different anatomical regions and MRI contrasts.
  • A limitation is that the framework is implemented as a 2D slice-based approach, which does not enforce volumetric consistency.
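The efficiency figures quoted above can be checked directly from the reported model sizes. The snippet below reproduces the percentage reductions relative to Res-SRDiff using only the parameter counts and GFLOPs stated in the abstract.

```python
# Reported figures: proposed model vs. Res-SRDiff baseline.
params_ours, params_srdiff = 0.9e6, 394e6   # parameters
flops_ours, flops_srdiff = 57, 2316         # GFLOPs

param_reduction = 1 - params_ours / params_srdiff
flop_reduction = 1 - flops_ours / flops_srdiff

print(f"parameter reduction: {param_reduction:.1%}")  # → 99.8%
print(f"compute reduction:   {flop_reduction:.1%}")   # → 97.5%
```

Both values match the claims in the abstract (99.8% fewer parameters, 97.5% less computation).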

Practical Implications

  • The efficient MRI super-resolution framework can reduce MRI acquisition times, improving patient comfort and scanner throughput in clinical settings.
  • Radiologists and clinicians can benefit from enhanced image resolution and anatomical detail, leading to more accurate diagnoses and treatment planning, particularly in neuroimaging and prostate cancer imaging.
  • Practitioners can use the proposed architecture as a starting point for developing SR models for other medical imaging modalities or anatomical regions, adapting the hybrid scanning strategy and lightweight channel MLP for specific applications.
  • Future research can focus on extending the framework to 3D architectures to ensure volumetric coherence, evaluating the model on more diverse datasets, and incorporating uncertainty quantification to address potential artifacts.
  • The reduced computational cost makes the method suitable for deployment on resource-constrained devices or in real-time applications, enabling wider accessibility to high-quality MRI imaging.
