About training

Question

long123524 opened this issue 3 years ago · comments

I trained on my own dataset and got the result graph below (I am working on a binary classification task). How can I improve it? Any help would be appreciated.

Weijian Xu · Answer 1 · Mon Jul 05 2021 13:03:02 GMT+0800 (China Standard Time)

It looks a bit weird. Did you use the VGG pre-trained weights?

Long · Answer 2 · Mon Jul 05 2021 14:41:09 GMT+0800 (China Standard Time)

No, I didn't use them. Do I have to use pre-trained weights? My original images have 6 bands, so loading the pre-trained weights raises an error.

Weijian Xu · Answer 3 · Tue Jul 06 2021 05:23:44 GMT+0800 (China Standard Time)

Hi @long123524,
For 6 bands, do you mean your input has 6 channels? If that's the case, you can try to load the weights for the VGG backbone except the first layer.
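For illustration, the idea of loading everything except the first layer can be sketched as below. This is a framework-agnostic sketch that treats weights as plain dictionaries of NumPy arrays; the layer names, the toy shapes, and the helper `copy_pretrained_except` are assumptions made for the example and are not taken from the HED code. In PyTorch, the same filtering could be applied to a state dict before calling `load_state_dict(..., strict=False)`.

```python
import numpy as np

def copy_pretrained_except(model_params, pretrained, skip_prefix):
    """Copy pretrained arrays into model_params in place.

    Entries whose name starts with skip_prefix, that are missing from the
    model, or whose shape does not match are left untouched.
    Returns the lists of copied and skipped names.
    """
    copied, skipped = [], []
    for name, value in pretrained.items():
        if name.startswith(skip_prefix) or name not in model_params:
            skipped.append(name)
            continue
        if model_params[name].shape != value.shape:
            skipped.append(name)
            continue
        model_params[name] = value.copy()
        copied.append(name)
    return copied, skipped

rng = np.random.default_rng(0)

# Hypothetical 3-channel pretrained weights (toy sizes).
pretrained = {
    "conv1_1.weight": rng.normal(size=(8, 3, 3, 3)),
    "conv1_1.bias": np.zeros(8),
    "conv1_2.weight": rng.normal(size=(8, 8, 3, 3)),
}

# Hypothetical model whose first layer takes 6 input channels.
model_params = {
    "conv1_1.weight": np.zeros((8, 6, 3, 3)),
    "conv1_1.bias": np.ones(8),
    "conv1_2.weight": np.zeros((8, 8, 3, 3)),
}

copied, skipped = copy_pretrained_except(model_params, pretrained, "conv1_1.")
print("copied:", copied)
print("skipped:", skipped)
```

The first layer keeps its own initialization here, so it still has to be learned from scratch on the 6-channel data.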

Long · Answer 4 · Wed Jul 07 2021 08:36:23 GMT+0800 (China Standard Time)

I tried loading the weights for the VGG backbone except the first layer; however, the result is not very satisfactory.

I trained for 150 epochs, and the picture shows the result at the 150th epoch. What else do I need to do to get good results?

Long · Answer 5 · Wed Jul 07 2021 09:35:06 GMT+0800 (China Standard Time)

I also tried loading the weights for the VGG backbone including the first layer, with BSDS as the training data, and the result is obviously different from the one on my own dataset. The first-layer VGG weights seem to be the more important ones. How should I modify the pre-trained first-layer weights so that the network can read a 6-channel dataset?

Weijian Xu · Answer 6 · Wed Jul 07 2021 11:21:31 GMT+0800 (China Standard Time)

Hi @long123524, could you try loading the weights for the VGG backbone excluding the first layer and training on the BSDS dataset? This would help check whether the parameters in the first layer are indeed essential.

Long · Answer 7 · Wed Jul 07 2021 14:29:05 GMT+0800 (China Standard Time)

After removing the first-layer VGG weights, training on the BSDS dataset does not give good results. Without any VGG pre-training, the results are also poor, as shown in the picture.

Therefore, I think the VGG pre-trained weights are necessary, including the first layer. My idea is to change the first layer of VGG to 6 channels, but how do I modify the pre-trained weight file to match? Where did this weight file come from? Can we pre-train a VGG ourselves?

Weijian Xu · Answer 8 · Fri Jul 09 2021 12:18:39 GMT+0800 (China Standard Time)

Hi @long123524, the pre-trained weight file comes from the original HED repo, and I think it was pre-trained on ImageNet classification (this may need double-checking). Your visualization suggests the first layer is necessary. However, I am not sure there is an easy way to convert the current 3-channel input layer to 6 channels. In addition, the 6-channel input of your dataset may have a different value distribution from the RGB images used in HED. Could you tell me what each channel in your data means? We can design a conversion based on the properties of your data.

Long · Answer 9 · Sat Jul 10 2021 14:49:38 GMT+0800 (China Standard Time)

Thank you very much for your reply. My research area is remote sensing. The 6 input channels are the 6 bands of the remote sensing image, and each band carries its own spectral information.

Weijian Xu · Answer 10 · Sun Jul 11 2021 11:15:44 GMT+0800 (China Standard Time)

Gotcha. I would suggest computing the statistics of the 6 channels (e.g. max/min/std/avg) and choosing a suitable normalization. Afterward, you can try copying the 3-channel input layer weights twice to make 6-channel input weights and see if that helps training.
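The two steps suggested above can be sketched roughly as follows. This is a minimal sketch, assuming the images are stored as a NumPy array of shape (N, C, H, W); the function names, the toy data, and the value range are made up for the example. The optional division by the number of copies is an extra assumption that is not part of the suggestion above: it keeps the summed response of the duplicated filters on roughly the same scale as the original 3-channel layer.

```python
import numpy as np

def channel_stats(images):
    """Per-channel min/max/mean/std for an array of shape (N, C, H, W)."""
    axes = (0, 2, 3)
    return {
        "min": images.min(axis=axes),
        "max": images.max(axis=axes),
        "mean": images.mean(axis=axes),
        "std": images.std(axis=axes),
    }

def normalize(images, mean, std, eps=1e-8):
    """Standardize each channel to roughly zero mean and unit variance."""
    return (images - mean[None, :, None, None]) / (std[None, :, None, None] + eps)

def inflate_first_conv(weight_rgb, repeats=2, rescale=True):
    """Tile a (out, 3, kH, kW) kernel along the input-channel axis.

    With repeats=2 the result has shape (out, 6, kH, kW).
    rescale=True divides by `repeats` (an assumption, see the note above).
    """
    weight = np.concatenate([weight_rgb] * repeats, axis=1)
    if rescale:
        weight = weight / repeats
    return weight

rng = np.random.default_rng(0)

# Toy stand-in for 6-band imagery with a large raw value range.
images = rng.uniform(0.0, 10000.0, size=(4, 6, 8, 8))
stats = channel_stats(images)
normed = normalize(images, stats["mean"], stats["std"])

# Toy stand-in for the pretrained 3-channel first-layer kernel.
w3 = rng.normal(size=(8, 3, 3, 3))
w6 = inflate_first_conv(w3)

print(stats["mean"].shape, normed.shape, w6.shape)
```

Which band is paired with which of the R, G, B filters is arbitrary in this sketch; a different ordering may suit a particular sensor better.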

Long · Answer 11 · Sun Jul 11 2021 14:03:40 GMT+0800 (China Standard Time)

I got the code to run, but the loss is very large, already in the tens of thousands. Why is this?

Weijian Xu · Answer 12 · Mon Jul 12 2021 07:09:28 GMT+0800 (China Standard Time)

I think you could try lowering the learning rate and testing several possible values.
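As a toy illustration of why trying several learning rates matters when the inputs have a large value range, the sketch below runs plain gradient descent on a one-parameter least-squares problem. It is not the HED training loop; the data, the candidate rates, and the function name are all made up for the example.

```python
def final_loss(lr, steps=30):
    """Run gradient descent on mean((w * x - y)^2) and return the last loss."""
    xs = [100.0, 200.0, 300.0]      # large, unnormalized inputs
    ys = [2.0 * x for x in xs]      # the true weight is 2
    w = 0.0
    for _ in range(steps):
        # d/dw of the mean squared error
        grad = sum(2.0 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Sweep learning rates on a log scale and keep the best one.
candidates = [1e-3, 1e-4, 1e-5, 1e-6]
results = {lr: final_loss(lr) for lr in candidates}
best_lr = min(results, key=results.get)

for lr in candidates:
    print(f"lr={lr:g}  final loss={results[lr]:.3e}")
print("best:", best_lr)
```

In this toy problem the two largest rates diverge and the smallest one converges slowly, so a coarse sweep over powers of ten is usually a reasonable first step before fine-tuning the value.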