
Academic Journal of Computing & Information Science, 2025, 8(11); doi: 10.25236/AJCIS.2025.081105.

Multi-Branch Medical Transformer for SPECT Myocardial Perfusion Imaging: A Novel Approach to Diagnosis

Author(s)

Fuling Zhao¹, Xuande Zhang¹, Long Xu², Xin Huang²

Corresponding Author:

Xin Huang

Affiliation(s)

¹ School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi’an, Shaanxi, China

² Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, Zhejiang, China

Abstract

This study introduces a novel deep learning approach to enhance the accuracy and efficiency of diagnosis in single-photon emission computed tomography myocardial perfusion imaging (SPECT MPI). To address key limitations of current convolutional neural network (CNN)-based methods—such as insufficient information capture, difficulty in removing redundant features, and limited capacity for modeling long-range dependencies—we reconstruct the three-dimensional structure of myocardial perfusion images in a stacked format and propose a multi-branch medical transformer network. This architecture extracts comprehensive features from different anatomical views while integrating critical information, leveraging the Transformer's strength in capturing long-range dependencies to overcome traditional CNN shortcomings. Experimental results demonstrate that the proposed method consistently outperforms conventional CNN-based models across multiple evaluation metrics, achieving improved feature extraction and higher diagnostic accuracy. Comparative experiments and ablation studies further validate the effectiveness of the multi-branch Transformer architecture. The proposed multi-branch vision transformer provides a powerful tool for automated SPECT MPI diagnosis, enhancing diagnostic performance and offering potential support for clinical decision-making.
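The multi-branch idea in the abstract can be illustrated with a minimal sketch. It is not the authors' network. It assumes three branches named after the standard SPECT MPI reorientation views (short axis, horizontal long axis, vertical long axis), treats each stacked slice as one token, and uses single-head self-attention with identity projections, where a real Transformer would learn query, key, and value projections. All function names and the toy input are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity query/key/value projections, to keep the sketch
    short. Every token attends to every other token, which is how
    long-range dependencies across slices are captured.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out


def branch_feature(tokens):
    """One branch: attend across all slice tokens of a view, then mean-pool."""
    attended = self_attention(tokens)
    n = len(attended)
    return [sum(t[i] for t in attended) / n for i in range(len(attended[0]))]


def multi_branch_features(views):
    """Concatenate per-view features into one vector for a downstream classifier."""
    fused = []
    for name in sorted(views):  # fixed order so the fused layout is reproducible
        fused.extend(branch_feature(views[name]))
    return fused


# Hypothetical toy input: three views, each a stack of 2-dimensional slice tokens.
views = {
    "short_axis": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "horizontal_long_axis": [[0.5, 0.5], [0.5, 0.5]],
    "vertical_long_axis": [[0.0, 2.0], [2.0, 0.0]],
}
fused = multi_branch_features(views)
print(len(fused))  # 3 views x 2 dimensions = 6
```

Concatenation is only one possible fusion choice; the abstract does not state how the branches are combined, so a learned fusion layer or cross-branch attention would be equally consistent with it.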

Keywords

Myocardial Perfusion Imaging; Tomography, Emission-Computed, Single-Photon; Vision Transformer; Coronary Artery Disease; Deep Learning

Cite This Paper

Fuling Zhao, Xuande Zhang, Long Xu, Xin Huang. Multi-Branch Medical Transformer for SPECT Myocardial Perfusion Imaging: A Novel Approach to Diagnosis. Academic Journal of Computing & Information Science (2025), Vol. 8, Issue 11: 38-51. https://doi.org/10.25236/AJCIS.2025.081105.
