nn.Linear:reset(std) has a surprise for the user

Question

jucor opened this issue 11 years ago

Hi guys

nn.Linear:reset(std) has a surprise for the user. If the argument std is specified, the first thing reset does is multiply it by sqrt(3); the result is then used as the bound of the uniform distribution from which the weights are drawn.

Would it be possible to either:
a/ document it.
b/ use "least surprise" as a principle :-)

I do realize that reading the whole code of the function makes it clear. However, having to do so for every function, as opposed to reading the declaration and scanning for the final place where the argument is used, makes for very defensive coding.
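The behaviour described in this report can be sketched outside of Torch. The snippet below is a minimal Python illustration, not the library's code: `reset_weights` and its parameters are hypothetical names, and the only thing taken from the thread is the rule "multiply std by sqrt(3), then use the result as the bound of a uniform distribution".

```python
import math
import random
import statistics


def reset_weights(n, std, seed=0):
    """Hypothetical stand-in for the reset(std) behaviour described above."""
    rng = random.Random(seed)
    # The requested std is scaled by sqrt(3) to get the half-width of the interval.
    bound = std * math.sqrt(3)
    weights = [rng.uniform(-bound, bound) for _ in range(n)]
    return weights, bound


weights, bound = reset_weights(100_000, std=0.1)
print(bound)                       # half-width actually used for sampling
print(statistics.pstdev(weights))  # empirical standard deviation of the samples
```

The sampling bound is larger than the argument passed in, while the empirical standard deviation of the samples should land close to the requested value.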

Ronan Collobert · Answer 1 · Sat Jul 27 2013 19:11:43 GMT+0800 (China Standard Time)

The standard deviation of a uniform distribution over [-a,a] is std=a/sqrt(3).
Thus the code is not only correct, it also behaves as expected.
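For readers who want the intermediate step, the relation std=a/sqrt(3) follows from the variance of a zero-mean uniform variable. The symbols X and sigma below are introduced here for the derivation only:

```latex
X \sim \mathcal{U}[-a, a], \qquad \mathbb{E}[X] = 0,
\qquad
\operatorname{Var}(X) = \int_{-a}^{a} \frac{x^{2}}{2a}\,dx
                      = \frac{1}{2a}\cdot\frac{2a^{3}}{3}
                      = \frac{a^{2}}{3},
\qquad
\sigma = \frac{a}{\sqrt{3}}
\;\Longleftrightarrow\;
a = \sqrt{3}\,\sigma .
```

So multiplying the requested standard deviation by sqrt(3) gives exactly the bound a of the interval that has that standard deviation.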

Julien Cornebise · Answer 2 · Sat Jul 27 2013 19:13:30 GMT+0800 (China Standard Time)

Can you at least please write it in the documentation?

Julien Cornebise · Answer 3 · Sat Jul 27 2013 19:25:20 GMT+0800 (China Standard Time)

Suggestion: I think I got confused by the uniform distribution being described in terms of its standard deviation, as opposed to the bounds of its support.
Would you accept a pull request that
1/ adds a second parameter to reset(), which would parametrize the distribution by its interval if both parameters are given, or by its standard deviation (current behaviour) if only one parameter is given
2/ adds the description of the two behaviours to the documentation?
I'd be happy to do that.
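The two-parameter proposal above could look roughly like the following. This is a sketch of the suggestion only, written in Python rather than against the Torch code base; `sampling_interval`, `reset_weights`, and their argument names are hypothetical, and no such pull request is shown in the thread.

```python
import math
import random


def sampling_interval(first, second=None):
    """Return (low, high) for the uniform draw under the proposed convention."""
    if second is None:
        # One argument: interpret it as a standard deviation (current behaviour).
        bound = first * math.sqrt(3)
        return (-bound, bound)
    if first > second:
        raise ValueError("lower bound must not exceed upper bound")
    # Two arguments: interpret them directly as the bounds of the support.
    return (first, second)


def reset_weights(n, first, second=None, rng=None):
    """Draw n weights using the interval chosen by sampling_interval."""
    rng = rng or random.Random()
    low, high = sampling_interval(first, second)
    return [rng.uniform(low, high) for _ in range(n)]


print(sampling_interval(0.1))        # interval derived from a standard deviation
print(sampling_interval(-0.2, 0.3))  # interval given explicitly
```

One trade-off of this design is that the meaning of the first argument depends on whether the second one is present, which is the kind of overloading the documentation would need to spell out.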

Ronan Collobert · Answer 4 · Tue Jul 30 2013 01:32:17 GMT+0800 (China Standard Time)

I think this is overkill. I chose the uniform distribution; it could have been something else, like a normal.
In any case, it is easy to override the reset function with your own if you want some custom initialization.
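The override pattern mentioned in this answer can be illustrated with a small Python sketch. The classes below are hypothetical stand-ins, not the Torch nn.Linear module; the default std of 0.1 is an arbitrary value chosen for the example, and the normal initializer is just one possible custom choice.

```python
import math
import random


class Linear:
    """Hypothetical minimal layer whose reset draws uniform weights."""

    def __init__(self, n_in, n_out, seed=0):
        self.n_in, self.n_out = n_in, n_out
        self.rng = random.Random(seed)
        self.reset()

    def reset(self, std=0.1):
        # Uniform on [-a, a] with a = std * sqrt(3), as discussed in the thread.
        bound = std * math.sqrt(3)
        self.weight = [[self.rng.uniform(-bound, bound) for _ in range(self.n_in)]
                       for _ in range(self.n_out)]


class NormalInitLinear(Linear):
    """Same layer, with reset overridden to use a normal distribution."""

    def reset(self, std=0.1):
        self.weight = [[self.rng.gauss(0.0, std) for _ in range(self.n_in)]
                       for _ in range(self.n_out)]


layer = NormalInitLinear(4, 3)
print(len(layer.weight), len(layer.weight[0]))  # rows and columns of the weight matrix
```

Because the constructor calls `self.reset()`, the subclass picks up its own initializer without any other change to the layer.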