Fix randomness in gradients
adrhill opened this issue
Zygote's gradient appears to be non-deterministic on Metalhead's VGG19:
```julia
julia> a = gradient((in) -> model(in)[1], imgp)[1];

julia> b = gradient((in) -> model(in)[1], imgp)[1];

julia> isapprox(a, b; atol=1e-3)
false

julia> isapprox(a, b; atol=1e-2)
true
```
We should check whether this is due to Dropout layers or to Zygote itself.
Flux's default behaviour appears to put layers in train mode when computing gradients; explicitly switching the model to test mode makes the gradient deterministic:
```julia
julia> Flux.testmode!(model);

julia> a = gradient((in) -> model(in)[1], img)[1];

julia> b = gradient((in) -> model(in)[1], img)[1];

julia> a ≈ b
true
```
More here:

> Many normalisation layers behave differently under training and inference (testing). By default, Flux will automatically determine when a layer evaluation is part of training or inference.
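This is easy to reproduce without VGG19. A minimal sketch (the `Chain` below is a hypothetical stand-in for the Metalhead model, not taken from this issue): in train mode, `Dropout` samples a fresh mask on every forward pass, so repeated gradient calls disagree; after `Flux.testmode!`, the layer is a no-op and gradients are reproducible.

```julia
using Flux  # Flux re-exports Zygote's `gradient`

# Hypothetical toy model with a Dropout layer, standing in for VGG19.
m = Chain(Dense(4 => 4, relu), Dropout(0.5), Dense(4 => 1))
x = rand(Float32, 4)

# Force train mode: Dropout draws a new random mask per call,
# so two identical gradient calls generally differ.
Flux.trainmode!(m)
g1 = gradient(in -> sum(m(in)), x)[1]
g2 = gradient(in -> sum(m(in)), x)[1]
# g1 ≈ g2 is usually false here

# In test mode Dropout is disabled, so the gradient is deterministic.
Flux.testmode!(m)
g3 = gradient(in -> sum(m(in)), x)[1]
g4 = gradient(in -> sum(m(in)), x)[1]
g3 ≈ g4  # true
```

This supports the Dropout hypothesis: the randomness comes from the layer's train-mode behaviour, not from Zygote.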
Closed by #11.