read: UNet

U-Net: Convolutional Networks for Biomedical Image Segmentation (MICCAI 2015)

Arxiv

Notes

U-Net is commonly used for image segmentation.

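  1. The goal of a CNN is to extract features (discarding redundant information). After several pooling layers (and convolutions with stride > 1), the final feature map has the smallest spatial size (resolution).
  2. Image segmentation, however, must predict a label for every pixel, so the final CNN feature map has to be brought back to the original size.
  3. U-Net upsamples step by step (a 2x2 up-convolution in the paper; interpolation in many implementations), producing feature maps that mirror those of the CNN.
  4. The higher-resolution feature maps from the intermediate CNN stages can then be combined with the corresponding upsampled feature maps to output a more precise segmentation map.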
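
The steps above can be illustrated with a small NumPy sketch. It is only a toy on a single 4x4 array, not the network itself; the helper names `max_pool_2x2` and `upsample_nearest_2x` are made up for this note. It shows that pooling shrinks the map and that plain upsampling brings the size back but not the discarded detail.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max-pooling with stride 2 on an (H, W) array with even H and W."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample_nearest_2x(x):
    """Nearest-neighbour 2x upsampling: each value is copied into a 2x2 block."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

image = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool_2x2(image)            # resolution halves: 4x4 -> 2x2
restored = upsample_nearest_2x(pooled)  # size is back to 4x4

print(image.shape, pooled.shape, restored.shape)  # (4, 4) (2, 2) (4, 4)
print(np.array_equal(restored, image))            # False: the detail is gone
```

The restored array has the right shape but only four distinct values, which is why the decoder needs the higher-resolution encoder maps as an extra input.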
U-Net architecture (Ronneberger et al., 2015)
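  • The left side is the CNN "contracting" path (encoder); the right side is the "expansive" path (decoder).

    Each level on the left applies two unpadded 3x3 convolutions, each followed by a ReLU, then downsamples with 2x2 max-pooling. The number of channels doubles after each downsampling step.

    Each level on the right first upsamples the feature map by 2x (a 2x2 up-convolution in the paper), concatenates the skip features, and applies two 3x3 convolutions with ReLU. Overall, the number of channels is halved at each level.

    The gray arrows are concatenations (a ResNet skip connection adds the tensors element-wise; it does not concatenate them). The encoder feature map is cropped at its borders, because the unpadded convolutions leave it larger than the decoder map, and is then concatenated onto the right-side feature map. A final 1x1 convolution maps the 64-channel feature map to the required number of classes.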

  • Upsampling does not add (restore) spatial information.

    What is the principle behind skip connections? Why does U-Net use skip connections? (answer by akkaze-郑安坤)
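
The cropping mentioned above follows from simple size bookkeeping, which can be checked with plain arithmetic. The sketch below tracks spatial sizes only (no weights); the function name `unet_sizes` is hypothetical, and the 572-pixel input and depth of 4 are assumptions taken from the figure in the paper rather than from these notes.

```python
def unet_sizes(input_size=572, depth=4):
    """Trace the spatial size through an unpadded U-Net (sizes only)."""
    size = input_size
    skips = []                      # encoder sizes kept for the skip connections
    for _ in range(depth):
        size -= 4                   # two unpadded 3x3 convs, each removes 2 pixels
        skips.append(size)
        if size % 2:
            raise ValueError("size must be even before 2x2 max-pooling")
        size //= 2                  # 2x2 max-pooling with stride 2
    size -= 4                       # two unpadded 3x3 convs at the bottom level
    crops = []
    for skip in reversed(skips):
        size *= 2                   # 2x upsampling
        crops.append((skip - size) // 2)  # pixels cropped from each border of the skip map
        size -= 4                   # two unpadded 3x3 convs after concatenation
    return size, skips, crops

out, skips, crops = unet_sizes()
print(out)    # 388
print(skips)  # [568, 280, 136, 64]
print(crops)  # [4, 16, 40, 88]
```

With padded convolutions, as many re-implementations use, the encoder and decoder maps already match in size and no cropping is needed.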

U-Net architecture for forest type segmentation, adapted from Ronneberger et al. (2015). The number of channels is indicated above the cuboids, and the vertical numbers indicate the row and column size in pixels.


(2023-07-10)

Example

Tensorflow: CV2020 - 16 - Object Segmentation.ipynb

PyTorch: U-Net: Training Image Segmentation Models in PyTorch

  1. Segmentation must assign a label to every pixel, so the output should have the same size as the input image.

  2. The hidden features lose spatial information along the contracting path. Up-sampling (interpolation) does not restore the location of features; it merely “copies” each feature to the surrounding pixels.

  3. The feature at each pixel is concatenated with the encoder feature from before contracting, which still sits at its original position. The subsequent convolutions then “fuse” each pixel’s location “characteristic” into its feature vector.

  4. The output feature map is an expansion of the compact hidden feature map, conditioned on spatial location.

This way, the trained model can classify a pixel based on its spatial position and the surrounding RGB features.
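
The crop, concatenate and fuse steps can be sketched in NumPy as follows. This is a toy under stated assumptions, not the code from the linked tutorials: the tensor sizes are arbitrary small values, `center_crop` is a helper made up for this note, and a 1x1 convolution (written as a per-pixel matrix product) stands in for the 3x3 convolutions that follow the concatenation.

```python
import numpy as np

def center_crop(x, size):
    """Crop a (C, H, W) array to (C, size, size) around its center."""
    _, h, w = x.shape
    top, left = (h - size) // 2, (w - size) // 2
    return x[:, top:top + size, left:left + size]

rng = np.random.default_rng(0)
skip = rng.normal(size=(4, 8, 8))   # encoder map: higher resolution, well localized
up = rng.normal(size=(4, 6, 6))     # decoder map after upsampling: coarse features

# Concatenate along the channel axis (a ResNet-style skip would add instead).
fused = np.concatenate([center_crop(skip, 6), up], axis=0)
print(fused.shape)  # (8, 6, 6)

# A 1x1 convolution is a linear map over channels applied at every pixel,
# so each output pixel mixes the encoder and decoder features at that location.
weights = rng.normal(size=(3, 8))   # 3 output channels (illustrative)
out = np.einsum("oc,chw->ohw", weights, fused)
print(out.shape)    # (3, 6, 6)
```

Concatenation leaves it to the following convolution to learn how to weight the two sources, whereas addition would fix that mix in advance and require equal channel counts.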

U-Net Explained: Understanding its Image Segmentation Architecture - medium
