deep-learning-with-pytorch / dlwpt-code

Code for the book Deep Learning with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann.

Home Page: https://www.manning.com/books/deep-learning-with-pytorch

weights.unsqueeze(-1).unsqueeze_(-1) vs weights.unsqueeze(-1).unsqueeze(-1)

rasbt opened this issue

On page 47, you are using

weights.unsqueeze(-1).unsqueeze_(-1)

I am wondering why you use the _ in the second unsqueeze call, i.e., why it is written as shown above and not

weights.unsqueeze(-1).unsqueeze(-1)
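For context, the trailing underscore is PyTorch's convention for in-place operations: `unsqueeze` returns a new tensor, while `unsqueeze_` modifies its tensor directly. A minimal sketch (the weight values are just illustrative):

```python
import torch

# Example 1-d weights (e.g., RGB-to-grayscale coefficients)
weights = torch.tensor([0.2126, 0.7152, 0.0722])

# Fully out-of-place: each unsqueeze returns a new tensor;
# weights itself is left unchanged.
a = weights.unsqueeze(-1).unsqueeze(-1)

# Mixed: the first call is out-of-place (so weights stays 1-d),
# the second modifies that intermediate result in place.
b = weights.unsqueeze(-1).unsqueeze_(-1)

print(weights.shape)  # torch.Size([3]) -- original untouched either way
print(a.shape, b.shape)  # torch.Size([3, 1, 1]) torch.Size([3, 1, 1])
```

Both chains produce the same `[3, 1, 1]` result; the question is only whether the second step allocates a fresh tensor.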

Hi Sebastian,

thank you for the comment!
So there are three parts to this

  • There is a philosophy that you'd reuse tensors when possible. So the first call is out-of-place to keep weights unchanged, and the second just modifies the result of the first unsqueeze in place instead of creating a new tensor and discarding the old one. If you're in a really tight loop, that might make sense, but
  • that philosophy is probably way over the top here, and unsqueeze would work just as well (and better, if it requires less thinking by people looking at the code).
  • Personally, I have grown a habit of using None indexing most of the time and think that is most concise. Given we have fixed dimensions (1d to 3d) here, unsqueezed_weights = weights[:, None, None] could be used (one could also use weights[..., None, None] to be strictly equivalent). My feeling is that this is also more explicit about the dimensions, so it seems good.
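The None-indexing variants above can be checked against the unsqueeze chain directly; a small sketch with illustrative weight values:

```python
import torch

weights = torch.tensor([0.2126, 0.7152, 0.0722])  # 1-d, shape [3]

# None in an index inserts a new axis of size 1 at that position
# (the same role numpy's np.newaxis plays).
unsqueezed_weights = weights[:, None, None]   # assumes a 1-d input
also_unsqueezed = weights[..., None, None]    # works for any input rank

# All three spellings give the same [3, 1, 1] tensor.
reference = weights.unsqueeze(-1).unsqueeze(-1)
print(unsqueezed_weights.shape)  # torch.Size([3, 1, 1])
print(torch.equal(unsqueezed_weights, reference))  # True
print(torch.equal(also_unsqueezed, reference))     # True
```

The `[:, None, None]` form hard-codes that the input is 1-d, which is the extra explicitness Thomas mentions; the `...` form stays shape-agnostic like the `unsqueeze(-1)` chain.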

Best regards

Thomas

Thanks for the detailed explanation, Thomas. The in-place modification makes sense from a best-practice perspective.

Thank you for taking up the issue!