TRI-ML / KP2D

Normalization of image inputs for network

mbcel opened this issue

commented

Thank you for the great work!

Looking into the evaluate_keypoint_net function, I can see that you use the to_color_normalized function, apparently to normalize the input images. However, this function does:

    images -= 0.5
    images *= 0.225

As far as I understand, this requires an input image in the range [0, 1] so that the subtraction shifts the center to zero, i.e. [-0.5, 0.5]. After that, it is multiplied by 0.225.
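To make the arithmetic concrete, here is a minimal sketch of the two quoted steps applied to the endpoints of a [0, 1] input. The helper name `shift_and_scale` is hypothetical and not part of KP2D; it only reproduces the two lines quoted above for a single pixel value.

```python
def shift_and_scale(value, shift=0.5, scale=0.225):
    """Apply the two quoted steps to one pixel value: subtract, then multiply.

    Hypothetical helper for illustration; not a KP2D function.
    """
    return (value - shift) * scale


# Endpoints of an input that is already scaled to [0, 1]
low = shift_and_scale(0.0)
high = shift_and_scale(1.0)
print(low, high)  # approximately -0.1125 and 0.1125
```

Under these assumptions the output spans roughly [-0.1125, 0.1125], which is centered at zero but is not [-1, 1].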

Two questions came up:

1) Where is the image converted to the [0, 1] range beforehand? As far as I know, cv2.imread() returns values in the [0, 255] range, and I cannot see any function in between that converts them to [0, 1]. That would mean the normalization does not work properly.
2) Why do you use the factor 0.225 afterwards? To me it looks rather arbitrary, and it does not lead to a normalized range of [-1, 1]. Is this the std of the dataset you used?

So what am I missing here?

transforms.ToTensor normalizes the image to [0, 1].
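For reference, torchvision's `transforms.ToTensor` converts an 8-bit PIL image or uint8 `numpy.ndarray` with values in [0, 255] into a float tensor with values in [0.0, 1.0]. The sketch below mimics only that value scaling in plain Python (no torchvision required) and compares the result of the two quoted steps with and without it. `scale_to_unit` and `shift_and_scale` are hypothetical helper names, not KP2D functions.

```python
def scale_to_unit(pixel):
    """Mimic ToTensor's value scaling for 8-bit input: [0, 255] -> [0.0, 1.0]."""
    return pixel / 255.0


def shift_and_scale(value, shift=0.5, scale=0.225):
    """The two quoted steps for one value (hypothetical helper)."""
    return (value - shift) * scale


raw_pixels = [0, 128, 255]  # example 8-bit intensities

# With the [0, 1] scaling applied first, values stay within about +/-0.1125
with_scaling = [shift_and_scale(scale_to_unit(p)) for p in raw_pixels]

# Without it, the same two steps produce much larger values
without_scaling = [shift_and_scale(float(p)) for p in raw_pixels]

print(with_scaling)
print(without_scaling)
```

So the quoted two steps only center the data around zero if the [0, 1] scaling has already happened upstream, which is what the reply above says ToTensor does.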

commented

ah okay that makes sense.

I am still wondering what's the derivation behind the 0.225 factor.

Excuse me, do you know by now what the 0.225 stands for?