Different result in disjoint 15-5s setting.
kona419 opened this issue
Hello,
I am trying to reproduce the disjoint 15-5s setting, but my results are very different from yours.
My command for step 0 is:
/home/nayoung/nayoung/MiB/run.py --data_root '/home/nayoung/nayoung/' --batch_size 10 --dataset voc --name MIB --task 15-5s --step 0 --lr 0.01 --epochs 30 --method MiB
For steps 1-5, I rerun the same command with the step index incremented (shown here for step 5; see the loop sketch below):
/home/nayoung/nayoung/MiB/run.py --data_root '/home/nayoung/nayoung/' --batch_size 10 --dataset voc --name MIB --task 15-5s --step 5 --lr 0.001 --epochs 30 --method MiB
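In case it's relevant, this is roughly how I chain the incremental runs (a sketch of my own wrapper; the `python` prefix and the loop are my local setup, not code from the MiB repo):

```python
import subprocess

# Step 0 is trained first with lr=0.01 (command above); each incremental
# step 1..5 then reuses the same experiment name with lr=0.001.
for step in range(1, 6):
    subprocess.run([
        "python", "/home/nayoung/nayoung/MiB/run.py",
        "--data_root", "/home/nayoung/nayoung/",
        "--batch_size", "10",
        "--dataset", "voc",
        "--name", "MIB",
        "--task", "15-5s",
        "--step", str(step),
        "--lr", "0.001",
        "--epochs", "30",
        "--method", "MiB",
    ], check=True)
```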
I used batch size 10 because of CUDA memory limits, and I didn't use the pretrained model.
I also set loss_kd=100.
Per-class IoU after each step (rows: steps 0-5; the last row averages each class over the steps where it appears):

step | background | aeroplane | bicycle | bird | boat | bottle | bus | car | cat | chair | cow | diningtable | dog | horse | motorbike | person | pottedplant | sheep | sofa | train | tvmonitor |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.857241 | 0.596404 | 0.249615 | 0.489829 | 0.336007 | 0.254114 | 0.694971 | 0.631736 | 0.539938 | 0.124421 | 0.380302 | 0.230107 | 0.470491 | 0.438303 | 0.5446194 | 0.615698 | | | | | |
1 | 0.822046 | 0.571647 | 0.246822 | 0.475479 | 0.322084 | 0.202237 | 0.607911 | 0.599167 | 0.515914 | 0.123029 | 0.300344 | 0.230606 | 0.447299 | 0.413728 | 0.5464546 | 0.603189 | 0.06315 | | | | |
2 | 0.812532 | 0.53895 | 0.238265 | 0.418296 | 0.236745 | 0.17652 | 0.540683 | 0.536196 | 0.477998 | 0.089279 | 0.28096 | 0.100062 | 0.383524 | 0.36603 | 0.5146568 | 0.589685 | 0.056601 | 0.065537 | | | |
3 | 0.523217 | 0.503853 | 0.216371 | 0.287688 | 0.198159 | 0.151194 | 0.494373 | 0.503627 | 0.455402 | 0.093359 | 0.119011 | 0.123516 | 0.33748 | 0.289346 | 0.5154741 | 0.565243 | 0.049754 | 0.061803 | 0.035291 | | |
4 | 0.424163 | 0.464728 | 0.215501 | 0.285088 | 0.162308 | 0.139302 | 0.465628 | 0.475487 | 0.407798 | 0.062629 | 0.131808 | 0.035045 | 0.331196 | 0.272611 | 0.4768657 | 0.551702 | 0.04413 | 0.06458 | 0.030589 | 0.110248 | |
5 | 0.303423 | 0.404756 | 0.196714 | 0.210973 | 0.101944 | 0.115709 | 0.366374 | 0.38747 | 0.39362 | 0.044943 | 0.073729 | 0.031481 | 0.310951 | 0.23618 | 0.4594278 | 0.545644 | 0.04088 | 0.061531 | 0.026771 | 0.092094 | 0.020551 |
class mIoU | | 0.51339 | 0.227215 | 0.361226 | 0.226208 | 0.173179 | 0.528324 | 0.522281 | 0.465112 | 0.08961 | 0.214359 | 0.125136 | 0.380157 | 0.336033 | 0.5095831 | 0.578527 | 0.050903 | 0.063363 | 0.030884 | 0.101171 | 0.020551 |
1-15: 0.350022
16-20: 0.053374
all: 0.27586
Hey! You probably get different results due to the different batch size.
This setting is particularly challenging because the data is non-i.i.d., so decreasing the batch size likely hampers performance.
Thank you for the reply.
My GPU memory limits my batch size, so could you recommend other hyperparameters (e.g., learning rate, weight decay) for the lower batch size that would reproduce the paper's results?
I actually never tried a lower batch size. The main issue is that a low batch size in the 15-1 setting increases the non-i.i.d.-ness of the data (you may try Batch Renormalization in place of BN, as in my https://arxiv.org/abs/2012.01415, but it may alter the results, possibly for the better). A sketch of the swap is below.
As a rule of thumb, you may double the iterations and halve the learning rate, but I can't guarantee it will work; see the second sketch below.
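If you want to try the BN swap, here's a minimal sketch (core PyTorch ships no Batch Renormalization layer, so `BatchRenorm2d` is a stand-in for a custom or third-party implementation):

```python
import torch.nn as nn

def replace_bn2d(module: nn.Module, factory) -> None:
    """Recursively replace every nn.BatchNorm2d with factory(num_features).

    `factory` is assumed to build a drop-in normalization layer, e.g. a
    Batch Renormalization implementation (not part of core PyTorch).
    """
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            setattr(module, name, factory(child.num_features))
        else:
            replace_bn2d(child, factory)

# Usage (BatchRenorm2d is hypothetical): replace_bn2d(model, BatchRenorm2d)
```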
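And the rule of thumb spelled out as a sketch (heuristic only; I'm assuming the reference batch size is roughly twice yours, e.g. 20 vs. 10):

```python
def scale_for_smaller_batch(lr: float, epochs: int, old_bs: int, new_bs: int):
    """Heuristic: if the batch size shrinks by a factor k, divide the
    learning rate by k and multiply the iterations (epochs) by k."""
    k = old_bs / new_bs
    return lr / k, round(epochs * k)

# e.g. incremental steps at batch size 10 instead of 20, starting from
# lr=0.001 and 30 epochs:
# scale_for_smaller_batch(0.001, 30, 20, 10) -> (0.0005, 60)
```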
Thank you for sharing!
I will try your recommendations.