Journal of Bionic Engineering, 2024, Vol. 21, Issue (3): 1511-1521. doi: 10.1007/s42235-024-00513-7


Rethinking the Encoder–decoder Structure in Medical Image Segmentation from Releasing Decoder Structure

Jiajia Ni1,2; Wei Mu2; An Pan2; Zhengming Chen2   

  1. School of Artificial Intelligence, Anhui Polytechnic University, Wuhu 241000, China
  2. School of Information Science and Engineering, HoHai University, Changzhou 213000, China
  • Online: 2024-05-20 Published: 2024-06-08
  • Contact: Jiajia Ni E-mail: jj.ni@ahpu.edu.cn
  • About author: Jiajia Ni1,2; Wei Mu2; An Pan2; Zhengming Chen2


Abstract: Medical image segmentation has advanced rapidly with the emergence of encoder–decoder based methods.
In the encoder–decoder structure, the decoding phase is meant not only to restore feature map resolution, but
also to mitigate the loss of feature information incurred during the encoding phase. However, this approach gives rise to a
challenge: the multiple up-sampling operations in the decoder themselves lose feature information. To address this
challenge, we propose a novel network, CBL-Net, that removes the decoding structure to reduce feature information loss. In
particular, we introduce a Parallel Pooling Module (PPM) to counteract the feature information loss stemming from convolution
and pooling operations during the encoding stage. Furthermore, we incorporate a Multiplexed Dilation Convolution
(MDC) module to expand the network's receptive field. Although the decoding stage is removed, the feature map
resolution must still be recovered, so we introduce the Global Feature Recovery (GFR) module, which uses an attention
mechanism to recover feature map resolution and can effectively reduce the loss of feature information.
We conduct extensive experimental evaluations on three publicly available medical image segmentation datasets: DRIVE,
CHASEDB and MoNuSeg. Experimental results show that our proposed network outperforms state-of-the-art
methods in medical image segmentation. In addition, by eliminating the decoding component, it achieves higher efficiency
than current encoder–decoder networks.
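
The abstract says that dilated convolutions expand the receptive field. As a rough illustration of that general effect (not the authors' MDC module, whose layout and dilation rates are not given here), the sketch below uses the standard receptive-field arithmetic for stacked convolutions. The function names and the dilation rates 1, 2, 4 are hypothetical, chosen only for the example.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels, covered by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of a stack of (kernel_size, dilation, stride) layers."""
    rf = 1    # a single output pixel sees one input pixel before any layer
    jump = 1  # distance in input pixels between neighbouring outputs
    for kernel_size, dilation, stride in layers:
        rf += (effective_kernel(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Three ordinary 3x3 convolutions, stride 1.
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
# Same depth and parameter count, with assumed dilation rates 1, 2, 4.
dilated = [(3, 1, 1), (3, 2, 1), (3, 4, 1)]

print(receptive_field(plain))    # 7
print(receptive_field(dilated))  # 15
```

With the same number of 3x3 layers and no pooling, the dilated stack in this example covers 15 input pixels per side instead of 7, so context grows without reducing feature map resolution.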

Key words: Medical image segmentation · Encoder–decoder architecture · Attention mechanisms · Releasing decoder architecture · Neural network
