Two questions about three lines of code.

Question

XiangqianMa opened this issue 6 years ago · comments

image = cv2.resize(image, (self.image_size, self.image_size))
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)
image = (image / 255.0) * 2.0 - 1.0

I have two questions about these three lines of code.

After the 'resize' operation, why do we need to call cv2.cvtColor?
What does the third line do?

I want to train on my own data, but the resolution of my samples is too low. After I apply these three operations, the result is terrible, so I would like to only resize my samples and skip the last two operations. After reading your code, though, I can't tell whether removing them would have any effect. (I am not a native English speaker; thanks for your answer.)

Hammad · Answer 1 · Thu Jun 13 2019 02:56:01 GMT+0800 (China Standard Time)

1- image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)
An image has three channels: R for red, G for green and B for blue. When we read an image with cv2 (imageMatrix = cv2.imread(imagePath)), the channels come back in BGR order: blue first, then green, then red. The line above converts the image from BGR to RGB.

2- image = (image / 255.0) * 2.0 - 1.0
This line just normalizes the data: it rescales pixel values from [0, 255] to [-1, 1], which helps optimization algorithms such as gradient descent or Adam converge faster.
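
The two steps above can be checked on a tiny array. This is only a sketch using NumPy, not the repository's code: the 1x2 image is made up for illustration, and reversing the channel axis is used as a stand-in for cv2.cvtColor(image, cv2.COLOR_BGR2RGB), which should give the same result for a plain 3-channel image.

```python
import numpy as np

# Hypothetical 1x2 image in BGR order, as cv2.imread would return it:
# the first pixel is pure blue, the second is pure red.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Step 1: reverse the channel axis (BGR -> RGB) and cast to float32.
rgb = bgr[..., ::-1].astype(np.float32)

# Step 2: rescale pixel values from [0, 255] to [-1, 1].
normalized = (rgb / 255.0) * 2.0 - 1.0

# Inverse of step 2, useful for viewing a normalized image again.
restored = ((normalized + 1.0) / 2.0 * 255.0).round().astype(np.uint8)

print(rgb[0, 0])         # the blue pixel, now in RGB order
print(normalized[0, 0])  # the same pixel after rescaling
```

If an image is fed to the network without step 2, its values stay in [0, 255] instead of [-1, 1], so it is worth checking which range the rest of the training code expects before removing it.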

The first step does not hurt training of the model, and the second does not hugely impact the training process.

XiangqianMa · Answer 2 · Fri Jun 14 2019 21:35:12 GMT+0800 (China Standard Time)

@Madi200 Thanks for your answer!