torch / DEPRECEATED-torch7-distro

Torch7: state-of-the-art machine learning algorithms

Home Page: www.torch.ch

nn.Linear:reset(std) has a surprise for the user

jucor opened this issue

Hi guys

nn.Linear:reset(std) has a surprise for the user. If the argument std is specified, the first thing reset does is multiply it by sqrt(3); the result is then used as the bound of the uniform distribution from which the weights are drawn.
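
To make the behaviour concrete, here is a minimal Python sketch of what is described above. It is not the Torch7 Lua source; the function names `uniform_bound_from_std` and `sample_weights` are made up for illustration, and only the sqrt(3) scaling and the symmetric uniform draw come from the description.

```python
import math
import random

def uniform_bound_from_std(std):
    # Hypothetical helper: scale the argument by sqrt(3), as described above.
    return std * math.sqrt(3)

def sample_weights(std, n, seed=0):
    # Hypothetical helper: draw n weights uniformly from [-bound, bound].
    rng = random.Random(seed)
    bound = uniform_bound_from_std(std)
    return [rng.uniform(-bound, bound) for _ in range(n)]

std = 0.1
bound = uniform_bound_from_std(std)
weights = sample_weights(std, n=10_000)
largest = max(abs(w) for w in weights)

print(f"requested std: {std}")
print(f"uniform bound: {bound:.4f}")  # 0.1732
print(f"largest |weight| drawn: {largest:.4f}")
```

A reader who expects std to be the bound itself would not expect weights larger than 0.1 in magnitude here, yet the draws extend to roughly 0.17.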

Would it be possible to either:
a/ document it.
b/ use "least surprise" as a principle :-)

I do realize that reading the whole code of the function makes it clear. However, having to do so for every function, as opposed to reading the declaration and scanning for the final place where the argument is used, makes for very defensive coding.

The standard deviation of a uniform distribution over [-a,a] is std=a/sqrt(3).
Thus not only is the code correct, it also behaves as expected.
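
For reference, the relation quoted above follows from a short calculation. The symbols X (a single weight) and a (the bound) are notation chosen here for illustration:

```latex
X \sim \mathcal{U}(-a, a), \qquad \mathbb{E}[X] = 0,
\qquad
\operatorname{Var}(X) = \int_{-a}^{a} \frac{x^{2}}{2a}\,dx
  = \frac{1}{2a}\cdot\frac{2a^{3}}{3} = \frac{a^{2}}{3},
\qquad
\mathrm{std} = \frac{a}{\sqrt{3}}
\;\Longleftrightarrow\;
a = \sqrt{3}\,\mathrm{std}.
```

So multiplying the requested std by sqrt(3) gives exactly the bound a for which the uniform distribution has that standard deviation.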

Can you at least please write it in the documentation?

Suggestion: I think I got confused because the uniform distribution is described in terms of its standard deviation, as opposed to the bounds of its support.
Would you accept a pull request that
1/ adds a second parameter to reset(), which would parametrize by the interval if both parameters are given, or by the standard deviation (current behaviour) if only one parameter is given
2/ adds a description of the two behaviours to the documentation?
I'd be happy to do that.
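
The two-mode parametrization proposed above could look roughly like the following Python sketch. This is only an illustration of the idea, not an existing Torch7 API: `init_bounds` and `reset` are hypothetical names, and treating the two arguments as the lower and upper end of the interval is an assumption.

```python
import math
import random

def init_bounds(first, second=None):
    # One argument: interpret it as a standard deviation (current behaviour)
    # and convert it to a symmetric interval [-std*sqrt(3), std*sqrt(3)].
    if second is None:
        half_width = first * math.sqrt(3)
        return -half_width, half_width
    # Two arguments: interpret them directly as the interval [low, high].
    if first > second:
        raise ValueError("lower bound must not exceed upper bound")
    return first, second

def reset(n_weights, first, second=None, seed=0):
    # Hypothetical stand-in for the layer's reset: returns a list of weights.
    low, high = init_bounds(first, second)
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n_weights)]

by_std = reset(1000, 0.1)             # std = 0.1, bound = 0.1 * sqrt(3)
by_interval = reset(1000, -0.5, 0.5)  # explicit interval [-0.5, 0.5]
```

With this shape, existing one-argument calls would keep their meaning, and only callers who pass two arguments would get the interval interpretation.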

I think this is overkill. I chose the uniform distribution; it could have been something else, like a normal.
In any case, it is easy to override the reset function with your own if you want some custom initialization.