deep-learning-with-pytorch / dlwpt-code

Code for the book Deep Learning with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann.

Home Page: https://www.manning.com/books/deep-learning-with-pytorch

weights.unsqueeze(-1).unsqueeze_(-1) vs weights.unsqueeze(-1).unsqueeze(-1)

rasbt opened this issue

On page 47, you are using

weights.unsqueeze(-1).unsqueeze_(-1)

I am wondering why you use the _ in the second unsqueeze call, i.e., why it is written as shown above and not

weights.unsqueeze(-1).unsqueeze(-1)
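For context, the trailing underscore is PyTorch's convention for in-place operations: `unsqueeze` returns a new tensor, while `unsqueeze_` modifies its tensor directly. A minimal sketch (the weight values are just illustrative):

```python
import torch

# Example 1-d weights (e.g., RGB-to-grayscale coefficients)
weights = torch.tensor([0.2126, 0.7152, 0.0722])

# Fully out-of-place: each unsqueeze returns a new tensor;
# weights itself is left unchanged.
a = weights.unsqueeze(-1).unsqueeze(-1)

# Mixed: the first call is out-of-place (so weights stays 1-d),
# the second modifies that intermediate result in place.
b = weights.unsqueeze(-1).unsqueeze_(-1)

print(weights.shape)  # torch.Size([3]) -- original untouched either way
print(a.shape, b.shape)  # torch.Size([3, 1, 1]) torch.Size([3, 1, 1])
```

Both chains produce the same `[3, 1, 1]` result; the question is only whether the second step allocates a fresh tensor.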

Hi Sebastian,

thank you for the comment!
So there are three parts to this

  • There is a philosophy that you'd reuse tensors when possible. So the first call is out-of-place to keep weights unchanged, and the second just modifies the result of the first unsqueeze in place instead of creating a new tensor and discarding the old one. If you're in a really tight loop, that might make sense, but
  • that philosophy is probably way over the top here, and unsqueeze would work just as well (and better, if it requires less thinking by people looking at the code).
  • Personally, I have grown a habit of using None indexing most of the time and think that is most concise. Given we have fixed dimensions (1d to 3d) here, unsqueezed_weights = weights[:, None, None] could be used (one could also use weights[..., None, None] to be strictly equivalent). My feeling is that this is also more explicit about the dimensions, so it seems good.
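The None-indexing variants above can be checked against the unsqueeze chain directly; a small sketch with illustrative weight values:

```python
import torch

weights = torch.tensor([0.2126, 0.7152, 0.0722])  # 1-d, shape [3]

# None in an index inserts a new axis of size 1 at that position
# (the same role numpy's np.newaxis plays).
unsqueezed_weights = weights[:, None, None]   # assumes a 1-d input
also_unsqueezed = weights[..., None, None]    # works for any input rank

# All three spellings give the same [3, 1, 1] tensor.
reference = weights.unsqueeze(-1).unsqueeze(-1)
print(unsqueezed_weights.shape)  # torch.Size([3, 1, 1])
print(torch.equal(unsqueezed_weights, reference))  # True
print(torch.equal(also_unsqueezed, reference))     # True
```

The `[:, None, None]` form hard-codes that the input is 1-d, which is the extra explicitness Thomas mentions; the `...` form stays shape-agnostic like the `unsqueeze(-1)` chain.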

Best regards

Thomas

Thanks for the detailed explanation, Thomas. The in-place modification makes sense from a best-practice perspective.

Thank you for taking up the issue!