feidfoe / learning-not-to-learn

[CVPR 2019] Learning Not to Learn: An adversarial method to train deep neural networks with biased data

How did you choose _lambda = 0.01?

ricvolpi opened this issue

Hi - thank you for releasing the code for your paper.

I was playing with your code because my colleagues and I are comparing a method we designed against yours. I was wondering how you settled on the hyper-parameter choice _lambda = 0.01 (trainer.py, line 107). I couldn't find any discussion of the hyper-parameter selection in your paper (my apologies if I have missed it).
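For context, here is my rough mental model of where _lambda enters. This is just a generic gradient-reversal sketch using the paper's f/g/h naming, not the actual trainer.py code:

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) the gradient
    in the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def total_loss(f, g, h, x, y, bias, _lambda=0.01):
    # f: feature extractor, g: label classifier, h: bias predictor.
    feat = f(x)
    cls_loss = F.cross_entropy(g(feat), y)
    # h is trained to predict the bias from gradient-reversed features,
    # so f is pushed toward bias-invariant features; _lambda sets how
    # strong that push is relative to the classification loss.
    adv_loss = F.cross_entropy(h(GradReverse.apply(feat, _lambda)), bias)
    return cls_loss + adv_loss
```

If that is roughly right, _lambda trades off classification accuracy against bias removal, which is why I would expect results to depend on its value.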

Best,
Riccardo

Basically, it's a rule of thumb. :)

Fortunately, the algorithm was not very sensitive to the choice of lambda in our experiments.

Very sorry for the late response.
Byungju Kim

Hi Byungju,

Many thanks for your reply. What is the rule of thumb though? :)

I have tried several values for lambda, and the model actually seems quite sensitive to it. Also, I could not work out how the published results were obtained, since I could not replicate them by running the released code as is.

Are they all obtained with the same lambda value? Is any form of early stopping involved?

Below are the final outputs for different combinations of benchmark, lambda value, and random seed.

Thanks in advance,
Riccardo


[unlearn_0.02_lambda_0.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7118 (7118/10000)
[unlearn_0.02_lambda_0.001_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.1333 (1333/10000)
[unlearn_0.02_lambda_0.01_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7336 (7336/10000) <- best
[unlearn_0.02_lambda_0.1_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.5819 (5819/10000)
[unlearn_0.02_lambda_1.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.4728 (4728/10000)

[unlearn_0.025_lambda_0.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7555 (7555/10000)
[unlearn_0.025_lambda_0.001_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7069 (7069/10000)
[unlearn_0.025_lambda_0.01_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.6885 (6885/10000)
[unlearn_0.025_lambda_0.1_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8289 (8289/10000) <- best
[unlearn_0.025_lambda_1.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.6464 (6464/10000)

[unlearn_0.03_lambda_0.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8516 (8516/10000) <- best
[unlearn_0.03_lambda_0.001_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7695 (7695/10000)
[unlearn_0.03_lambda_0.01_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7982 (7982/10000)
[unlearn_0.03_lambda_0.1_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7383 (7383/10000)
[unlearn_0.03_lambda_1.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.6113 (6113/10000)

[unlearn_0.035_lambda_0.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8506 (8506/10000)
[unlearn_0.035_lambda_0.001_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8781 (8781/10000) <- best
[unlearn_0.035_lambda_0.01_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8514 (8514/10000)
[unlearn_0.035_lambda_0.1_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7488 (7488/10000)
[unlearn_0.035_lambda_1.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7396 (7396/10000)

[unlearn_0.04_lambda_0.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8655 (8655/10000)
[unlearn_0.04_lambda_0.001_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8747 (8747/10000) <- best
[unlearn_0.04_lambda_0.01_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8689 (8689/10000)
[unlearn_0.04_lambda_0.1_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8130 (8130/10000)
[unlearn_0.04_lambda_1.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7493 (7493/10000)

[unlearn_0.045_lambda_0.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.9277 (9277/10000) <- best
[unlearn_0.045_lambda_0.001_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.9277 (9277/10000) <- best
[unlearn_0.045_lambda_0.01_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.9017 (9017/10000)
[unlearn_0.045_lambda_0.1_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8775 (8775/10000)
[unlearn_0.045_lambda_1.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8682 (8682/10000)

[unlearn_0.05_lambda_0.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.9219 (9219/10000)
[unlearn_0.05_lambda_0.001_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.9334 (9334/10000)
[unlearn_0.05_lambda_0.01_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.9429 (9429/10000) <- best
[unlearn_0.05_lambda_0.1_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8914 (8914/10000)
[unlearn_0.05_lambda_1.0_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8712 (8712/10000)
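(For completeness: the sweep above was nothing more than a nested loop over that grid. In the sketch below, run_experiment is just a placeholder for one full training run with the released code, not an actual entry point in the repo.)

```python
import itertools

def run_experiment(unlearn, _lambda, seed):
    """Placeholder: launch one training run with these settings and
    return the final evaluation accuracy."""
    return 0.0  # stand-in value

# 7 values of the unlearn setting x 5 lambdas, one seed, as logged above.
unlearn_values = [0.02, 0.025, 0.03, 0.035, 0.04, 0.045, 0.05]
lambdas = [0.0, 0.001, 0.01, 0.1, 1.0]

for u, lam in itertools.product(unlearn_values, lambdas):
    acc = run_experiment(unlearn=u, _lambda=lam, seed=213)
    print(f"[unlearn_{u}_lambda_{lam}_seed_213] ACCURACY : {acc:.4f}")
```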

Hi, Riccardo.

Try pre-training: train your network without the h network before starting the adversarial process.
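Something along these lines; just a toy sketch with placeholder models and dummy data, not our exact code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder models: f is the feature extractor, g the label classifier.
f = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU())
g = nn.Linear(64, 10)

opt = torch.optim.Adam(list(f.parameters()) + list(g.parameters()), lr=1e-3)
x = torch.randn(32, 1, 28, 28)    # dummy image batch
y = torch.randint(0, 10, (32,))   # dummy class labels

# Stage 1: plain supervised pre-training; the bias predictor h is not
# involved at all, so no gradient ever flows through it here.
for _ in range(100):
    loss = F.cross_entropy(g(f(x)), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: only after this warm-up, add h and the _lambda-weighted
# adversarial objective and continue training as before.
```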

When we tried this, the overall performance improved a bit.
If it does not work, there may be a bug in the released version of our code.

We'll try to track it down.
Thanks.

Byungju Kim