How to draw flatness curve in Figure 3?
FrankZhangRp opened this issue · comments
Hi,
Thank you so much for providing this repo, the work is awesome!
How can we reproduce the loss-gap curve in Figure 3 of the paper? How do you add the gamma perturbation to the model parameters, and what is the distance metric on the x-axis? I flattened the model parameter dict into one vector and added a noise vector with norm 1.0, which gives a loss gap of about 0.2 on the p-domain test set, so I must have made a mistake in the Monte-Carlo approximation sampling.
Thanks a lot!
Hi, thanks for your interest in our study.
We first sample a unit direction vector and compute the loss gap after shifting the model parameters by the radius gamma; the parameter difference is gamma * unit_direction_vector. The reported value is averaged over 100 sampled direction vectors. The x-axis indicates gamma.
Simple pytorch-style pseudo code is:
# sample one random unit direction in parameter space
n_params = num_parameters(model)
direction_vector = torch.randn(n_params)
unit_direction_vector = direction_vector / torch.norm(direction_vector)

for gamma in gamma_list:
    # perturb the parameters by radius gamma along the sampled direction
    noised_model = get_noised_model(model, unit_direction_vector * gamma)
    loss_gap = evaluate(noised_model) - evaluate(model)
# repeat over 100 sampled directions and average the loss gaps
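The pseudo code above can be fleshed out into a self-contained sketch. The helpers `num_parameters`, `get_noised_model`, and `evaluate` from the snippet are stand-ins, so this version instead flattens the parameters into one vector, perturbs a copy, and averages the gap over several random unit directions, which is one plausible reading of the described procedure (not necessarily the authors' exact implementation):

```python
import torch
import torch.nn as nn

def flat_params(model):
    # concatenate all parameters into a single flat vector
    return torch.cat([p.detach().reshape(-1) for p in model.parameters()])

def set_flat_params(model, flat):
    # write a flat vector back into the model's parameter tensors
    offset = 0
    with torch.no_grad():
        for p in model.parameters():
            n = p.numel()
            p.copy_(flat[offset:offset + n].view_as(p))
            offset += n

def loss_gap_curve(model, loss_fn, gamma_list, n_dirs=100):
    # Monte-Carlo estimate of the loss gap at each radius gamma,
    # averaged over n_dirs random unit directions in parameter space
    theta = flat_params(model)
    base_loss = loss_fn(model)
    gaps = {g: 0.0 for g in gamma_list}
    for _ in range(n_dirs):
        d = torch.randn_like(theta)
        d = d / d.norm()  # unit direction vector
        for g in gamma_list:
            set_flat_params(model, theta + g * d)
            gaps[g] += (loss_fn(model) - base_loss) / n_dirs
    set_flat_params(model, theta)  # restore the original weights
    return gaps
```

Here `loss_fn` would be whatever computes the average evaluation loss on the target-domain test set; plotting `gamma_list` against the returned gaps gives a Figure-3-style flatness curve.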
got it! Very clear! Thanks a lot!
The loss gap I get seems to be wrong. Did you solve this problem?
- Three converged models are used. In particular, the models have converged before 1000 steps (see Fig. 5), and the models from steps 2500, 3500, and 4500 are used.
- Yes.