huangzehao / caffe-vdsr

A Caffe-based implementation of very deep convolution network for image super-resolution

loss during training

lemony1314 opened this issue

First of all, thanks for your work.
I have trained for nearly 300000 iterations with the code you provided. The training images were augmented to 5820, and train_num is 748608.
The test loss then converges to almost 0.4. Is that reasonable?

Hi, 0.4 is OK. You can test your trained model on Set5 or Set14 and check the PSNR. You can also have a look at my training log: https://raw.githubusercontent.com/huangzehao/caffe-vdsr/master/Train/VDSR_291_multiscale_adam.log
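For readers who want to check the PSNR themselves, here is a minimal sketch of the standard definition, PSNR = 10 * log10(peak^2 / MSE). It is not the evaluation script of this repository: the function name `psnr` and the flat-list image representation are assumptions made for illustration. Published super-resolution numbers are commonly computed on the luminance channel after cropping a border, and this sketch leaves both steps out.

```python
import math

def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities (illustrative
    representation). `peak` is the maximum possible intensity:
    255 for 8-bit data, 1.0 for images scaled to [0, 1].
    """
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("images must not be empty")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because the score depends on `peak`, the same pair of images must be compared at the same intensity scale as the numbers you want to match.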

@huangzehao Thanks for your reply.
I find that the PSNR neither increases nor converges as the number of iterations grows.
After 170000 iterations, the PSNR of butterfly is 29.922.
After 250000 iterations, the PSNR of butterfly is 29.898.
After 360000 iterations, the PSNR of butterfly is 29.917.
After 450000 iterations, the PSNR of butterfly is 29.714.
After 520000 iterations, the PSNR of butterfly is 29.978.
Is that weird?

@huangzehao Looking forward to your reply! Thank you!

Hi, sorry for the late reply.
You should benchmark your model on the full datasets, including Set5, Set14 and BSD100.
The PSNR of a single image sometimes cannot represent the performance of your model.
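To illustrate the point, the sketch below takes the five single-image values reported earlier in this thread and measures how much they move between checkpoints, then shows a small helper for averaging per-image scores over a whole benchmark set. The helper name `dataset_psnr` is hypothetical and not part of this repository.

```python
from statistics import mean

# PSNR (dB) of the single "butterfly" image at each checkpoint,
# copied from the values reported in this thread.
butterfly_psnr = {
    170000: 29.922,
    250000: 29.898,
    360000: 29.917,
    450000: 29.714,
    520000: 29.978,
}

values = list(butterfly_psnr.values())
butterfly_mean = mean(values)
butterfly_spread = max(values) - min(values)
print(f"mean {butterfly_mean:.3f} dB, spread {butterfly_spread:.3f} dB")

def dataset_psnr(per_image_psnr):
    """Average per-image PSNR values (dB) over a whole benchmark set."""
    if not per_image_psnr:
        raise ValueError("need at least one image")
    return mean(per_image_psnr)
```

The five checkpoints differ by roughly 0.26 dB on this one image, with no clear trend, which is why averaging over every image in Set5, Set14 and BSD100 gives a more stable picture than tracking one image.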

Hi, I have a question regarding the training. How many iterations do we need to train a reasonable VDSR model?
According to your discussion above, it seems at least 2*10^5 iterations are needed.
But the paper states that training takes less than 4 hours, which I think is not enough to finish more than 10^5 iterations.

@xuxy09 Hi, check this #29

Thanks. It seems we have the same concern about the training time. I agree with you that it is impossible to finish 80 epochs in 4 hours with one Titan Z, especially considering that Caffe is generally faster than MatConvNet. I think 24 hours is a more reasonable answer.

Hello @huangzehao, I also have the same question about the test loss value. In your training log, the test loss converges to 0.4 after 15 epochs. The results are fine when testing on Set5, but not on Set14 and BSD100. So I would like to know how many epochs are needed for Set14 and BSD100 to reach the same results as in the paper. Hoping for your reply, thank you!

@ChaofWang Hi, you can test the trained model on Set14 and BSD100. 20 or 30 epochs are enough.
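Since the thread mixes iteration counts and epoch counts, a small conversion sketch may help. It uses the train_num of 748608 from the first post. The batch size of 64 is an assumption, as it is not stated in this thread, so check the value in your own training configuration. The function name is hypothetical.

```python
def iterations_per_epoch(train_num, batch_size):
    """Number of iterations needed to see every training sample once."""
    if train_num <= 0 or batch_size <= 0:
        raise ValueError("train_num and batch_size must be positive")
    return -(-train_num // batch_size)  # ceiling division

TRAIN_NUM = 748608  # from the first post in this thread
BATCH_SIZE = 64     # assumption: not stated in this thread

per_epoch = iterations_per_epoch(TRAIN_NUM, BATCH_SIZE)

# How many epochs the 300000 iterations from the first post correspond to.
epochs_at_300k = 300000 / per_epoch

# Iterations needed for the epoch counts mentioned in this thread.
iterations_needed = {epochs: epochs * per_epoch for epochs in (20, 30, 80)}

print(per_epoch, round(epochs_at_300k, 2), iterations_needed)
```

Under that batch-size assumption, 300000 iterations is a little under 26 epochs of this particular training set, which falls inside the 20 to 30 epoch range suggested above. A different train_num or batch size changes these figures proportionally.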