OpenGVLab / SAM-Med2D

Official implementation of SAM-Med2D

Why take 3-channel PNG images as input?

skywalker0523 opened this issue

Thank you very much for the excellent work! I have a question. The 2D slices of 3D nii images have one channel, but you converted these slices into PNG images with three channels. Additionally, the mean and std differ across these three channels: mean: [123.675, 116.28, 103.53], std: [58.395, 57.12, 57.375]. Could you please explain how this conversion was done? I intend to fine-tune your model on my own dataset.

The images are converted to three channels to match the input format of the original ViT model. Here, the conversion simply replicates the original single-channel image three times, so all three channels hold the same values. The normalization also follows the parameters used in the original SAM, which is why the mean and std differ per channel. When fine-tuning your own network, you can set different mean and standard deviation values, such as [0.5], [0.5].
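For anyone preparing their own data, here is a minimal NumPy sketch of the replication and normalization described above. It is not the repository's actual preprocessing code: the function names `slice_to_three_channels` and `normalize` are hypothetical, and the min-max rescaling to 0-255 is an assumption, since the thread does not say how slice intensities were mapped to 8-bit values.

```python
import numpy as np

# Normalization constants quoted in the question (0-255 intensity scale).
PIXEL_MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
PIXEL_STD = np.array([58.395, 57.12, 57.375], dtype=np.float32)


def slice_to_three_channels(slice_2d):
    """Rescale a 2D slice to uint8 and replicate it into an (H, W, 3) array.

    Assumption: min-max rescaling to 0-255. The thread does not specify
    the intensity mapping, so adapt this (e.g. CT windowing) to your data.
    """
    slice_2d = np.asarray(slice_2d, dtype=np.float32)
    lo, hi = float(slice_2d.min()), float(slice_2d.max())
    if hi > lo:
        scaled = (slice_2d - lo) / (hi - lo) * 255.0
    else:
        # Constant slice: avoid division by zero.
        scaled = np.zeros_like(slice_2d)
    gray = np.round(scaled).astype(np.uint8)
    # Copy the single channel three times along a new last axis.
    return np.repeat(gray[:, :, None], 3, axis=2)


def normalize(image_hwc, mean=PIXEL_MEAN, std=PIXEL_STD):
    """Per-channel normalization: (x - mean) / std, broadcast over H and W."""
    return (image_hwc.astype(np.float32) - mean) / std
```

If you switch to values such as [0.5], [0.5], note that these presumably apply to images already scaled to the [0, 1] range, whereas the constants above apply to the 0-255 range, so make sure the mean/std and the pixel scale match.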