
Journal of Bionic Engineering ›› 2025, Vol. 22 ›› Issue (4): 2050-2074. doi: 10.1007/s42235-025-00727-3


Global–Local Hybrid Modulation Network for Retinal Vessel and Coronary Angiograph Segmentation

Pengfei Cai1, Biyuan Li1,2, Jinying Ma1, Xiao Tian1, Jun Yan3

  

  1. School of Electronic Engineering, Tianjin University of Technology and Education, Tianjin 300222, China
  2. Tianjin Development Zone Jingnuohanhai Data Technology Co., Ltd, Tianjin, China
  3. School of Mathematics, Tianjin University, Tianjin 300072, China
  • Online: 2025-06-19  Published: 2025-08-31
  • Contact: Biyuan Li  E-mail: lby@tute.edu.cn

Abstract: The segmentation of retinal vessels and coronary angiographs is essential for diagnosing conditions such as glaucoma, diabetes, hypertension, and coronary artery disease. However, retinal vessels and coronary angiographs are characterized by low contrast and complex structures, which pose challenges for vessel segmentation. Moreover, CNN-based approaches are limited in capturing long-range pixel relationships because of their focus on local feature extraction, while ViT-based approaches struggle to capture fine local details, hampering tasks such as vessel segmentation that require precise boundary detection. To address these issues, we propose a Global–Local Hybrid Modulation Network (GLHM-Net), a dual-encoder architecture that combines the strengths of CNNs and ViTs for vessel segmentation. First, the Hybrid Non-Local Transformer Block (HNLTB) is proposed to efficiently consolidate long-range spatial dependencies into a compact feature representation, providing a global perspective while significantly reducing computational overhead. Second, the Collaborative Attention Fusion Block (CAFB) is proposed to more effectively integrate local and global vessel features at the same hierarchical level during the encoding phase. Finally, the proposed Feature Cross-Modulation Block (FCMB) better complements the local and global features in the decoding stage, effectively enhancing feature learning and minimizing information loss. Experiments on the DRIVE, CHASEDB1, DCA1, and XCAD datasets achieve AUC values of 0.9811, 0.9864, 0.9915, and 0.9919, F1 scores of 0.8288, 0.8202, 0.8040, and 0.8150, and IoU values of 0.7076, 0.6952, 0.6723, and 0.6878, respectively, demonstrating the strong performance of the proposed network for vessel segmentation.
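For readers unfamiliar with the non-local attention that underlies blocks like the HNLTB, the sketch below illustrates the generic non-local (self-attention) operation over a feature map: every output position aggregates information from all spatial positions, capturing the long-range dependencies that plain convolutions miss. This is a minimal NumPy illustration of the general mechanism, not the authors' HNLTB; the random projection matrices stand in for learned weights.

```python
import numpy as np

def non_local_attention(x, rng=None):
    """Generic non-local (self-attention) operation on a (H, W, C) feature map.

    Each output position is a softmax-weighted sum over all H*W positions,
    added back to the input via a residual connection.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    H, W, C = x.shape
    flat = x.reshape(H * W, C)                   # (N, C), N = H*W positions

    # Random projections stand in for the learned query/key/value weights.
    W_q = rng.standard_normal((C, C)) / np.sqrt(C)
    W_k = rng.standard_normal((C, C)) / np.sqrt(C)
    W_v = rng.standard_normal((C, C)) / np.sqrt(C)

    q, k, v = flat @ W_q, flat @ W_k, flat @ W_v
    scores = q @ k.T / np.sqrt(C)                # (N, N) pairwise affinities
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # softmax over all positions

    out = attn @ v                               # global aggregation
    return (flat + out).reshape(H, W, C)         # residual connection

# A tiny 8x8 "feature map" with 4 channels:
y = non_local_attention(np.random.default_rng(1).standard_normal((8, 8, 4)))
print(y.shape)  # (8, 8, 4)
```

Note the (N, N) affinity matrix: its quadratic cost in the number of positions is exactly the overhead that efficient variants such as the paper's HNLTB aim to reduce.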

Key words: Non-local transformer, Feature fusion, Collaborative attention, Retinal vessel segmentation, Coronary angiograph segmentation