hizhangp / yolo_tensorflow

Tensorflow implementation of YOLO, including training and test phase.

Two questions about three lines of code.

XiangqianMa opened this issue

image = cv2.resize(image, (self.image_size, self.image_size))
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)
image = (image / 255.0) * 2.0 - 1.0

I have two questions about these three lines of code.

  1. After the resize operation, why do we need to call the cvtColor function?

  2. What is the purpose of the third line?

I want to train on my own data, but my samples' resolution is very low. After I apply these three operations, the result is terrible, so I would like to just resize my samples and skip the last two operations. After reading your code, though, I can't tell whether removing them would have any effect. (I am not a native English speaker; thanks for your answer.)

1- image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32):
An image has three color channels: R (red), G (green), and B (blue). When we read an image with OpenCV, e.g. imageMatrix = cv2.imread(imagePath), the channels are stored in BGR order: blue first, then green, then red. The line above converts the image from BGR to RGB and casts it to float32.
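
To make the channel order concrete, here is a minimal sketch that uses only NumPy. For a 3-channel image, cv2.cvtColor(image, cv2.COLOR_BGR2RGB) produces the same values as reversing the last axis, which is what the hypothetical helper bgr_to_rgb below does. The tiny 1x2 test image is made up for illustration.

```python
import numpy as np


def bgr_to_rgb(image_bgr):
    """Reverse the channel axis of an H x W x 3 array (BGR -> RGB).

    Hypothetical helper for illustration; for 3-channel images it gives the
    same values as cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB).
    """
    return image_bgr[..., ::-1]


# Made-up 1x2 image in BGR order: one pure-blue pixel, one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)

# In RGB order the blue pixel now has its 255 in the last position,
# and the red pixel has its 255 in the first position.
print(rgb[0, 0], rgb[0, 1])
```

If this step is skipped, the network simply sees red and blue swapped. That is harmless as long as the same channel order is used for training and for testing, but it matters when loading weights that were trained on RGB input.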

2- image = (image / 255.0) * 2.0 - 1.0
This line normalizes the data: it rescales pixel values from [0, 255] to [-1, 1], which helps optimization algorithms such as gradient descent or Adam converge faster.
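
The scaling can be checked on a few sample values. The sketch below wraps the line from the question in a function and adds an inverse mapping; the names normalize and denormalize are hypothetical and do not come from the repository.

```python
import numpy as np


def normalize(image):
    """Map pixel values from [0, 255] to [-1.0, 1.0] (same formula as above)."""
    image = image.astype(np.float32)
    return (image / 255.0) * 2.0 - 1.0


def denormalize(image):
    """Hypothetical inverse: map [-1.0, 1.0] back to uint8 values in [0, 255]."""
    restored = (image + 1.0) / 2.0 * 255.0
    return np.clip(restored, 0, 255).round().astype(np.uint8)


# Darkest, mid-gray, and brightest pixel values.
pixels = np.array([0, 128, 255], dtype=np.uint8)
scaled = normalize(pixels)
restored = denormalize(scaled)

print(scaled)    # values lie between -1 and 1
print(restored)  # back to the original uint8 values
```

The normalized array is meant as network input, not for viewing. Assuming the "terrible" result mentioned in the question refers to how the image looks when displayed or saved, mapping it back with something like denormalize first would be one way to check that no information was lost.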

The first step does not hurt training of the model, while the second does not hugely impact the training process.

@Madi200 Thanks for your answer!