deep-learning-with-pytorch / dlwpt-code

Code for the book Deep Learning with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann.

Home Page: https://www.manning.com/books/deep-learning-with-pytorch

Misleading code [3.8.2]

OverLordGoldDragon opened this issue

We can easily verify that the two tensors share the same storage:
id(points.storage()) == id(points_t.storage())

The is operator should be used instead, and it returns False here. Comparing id() values is unreliable, since a temporary object can be freed and its address reused; try id(float(3)) == id(float(4)). See here.
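For concreteness, a minimal demonstration of the pitfall (pure Python, nothing PyTorch-specific): CPython can free the first temporary and hand its address to the second, so two distinct objects may report the same id.

```python
# id() comparison on temporaries is unreliable: the first float is
# garbage-collected before the second is created, and CPython often
# reuses the freed memory slot.
print(id(float(3)) == id(float(4)))   # often True in CPython!

# `is` keeps both operands alive for the comparison, so it correctly
# reports that they are distinct objects.
print(float(3) is float(4))           # False
```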

Hi. You are absolutely right that there is a problem here, thank you!
The issue is that there isn't a one-to-one mapping between the C++ storage (which would be the same) and the Python Storage objects (which are recreated with every .storage() call). So if the comparison returns True here, it is due to the incorrect use of ==.
Incidentally, as far as I know, the longer-term idea seems to be that PyTorch won't use a storage class at all (but will instead have views of a tensor), but it's not a priority.

@t-vi The main point, that two tensors share the same Storage, is correct, but it's also important not to confuse that with identical Python ids; I think the existing examples of editing a storage and affecting both tensors do well enough to explain.
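As a sketch of what a reliable check could look like (comparing the underlying data pointers rather than the ids of the per-call Python wrapper objects; the tensor values here are made up):

```python
import torch

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
points_t = points.t()   # transpose: a view over the same storage

# Each .storage() call builds a fresh Python wrapper object, so
# comparing their ids says nothing about the underlying C++ storage:
print(id(points.storage()) == id(points_t.storage()))   # unreliable

# Comparing the data pointers checks the actual memory instead:
print(points.storage().data_ptr() == points_t.storage().data_ptr())  # True

# And the book's mutation test: editing through one tensor is
# visible through the other.
points[0, 0] = 100.0
print(points_t[0, 0])   # tensor(100.)
```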

Otherwise a fine textbook you've made available freely, thank you.

@t-vi Also a misleading statement in 5.5.2:

[screenshot of the quoted passage from section 5.5.2]

This only holds for this baby example and completely breaks down for larger models, so, to avoid understating the importance of input normalization, maybe add "... blink (in this simple case):"

Well, the key takeaway we wanted to get across here is that Adam will automatically "equilibrate" the step size between the parameters, while SGD needs you to do this manually. We could have made it clearer that it isn't universal that you can use this learning rate, but at this point we are focusing closely on this one example.
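For illustration, a small synthetic variant of the chapter's setup (the data, model, and learning rates below are made up for this sketch, not taken from the book): with unnormalized inputs the weight's gradient dwarfs the bias's, so SGD is forced to a tiny learning rate and the bias crawls, while Adam's per-parameter scaling copes directly.

```python
import torch

t_u = torch.linspace(20.0, 90.0, 11)   # unnormalized inputs, large scale
t_c = 0.5 * t_u - 17.0                 # targets from a known linear rule

def train(optim_cls, lr, steps=5000):
    params = torch.zeros(2, requires_grad=True)   # [w, b]
    optimizer = optim_cls([params], lr=lr)
    for _ in range(steps):
        w, b = params
        loss = ((w * t_u + b - t_c) ** 2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return params.detach(), loss.item()

# SGD's largest stable learning rate is dictated by the big w-gradient
# (roughly lr < 3e-4 here), which then makes the bias converge slowly;
# Adam rescales each parameter's step, so lr=1e-1 works out of the box.
print(train(torch.optim.SGD, lr=1e-4))
print(train(torch.optim.Adam, lr=1e-1))
```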

@t-vi Done reading; ended at ch12. Likely typos:

  • 8.2.4: MaxPool3d -> MaxPool2d
  • 8.3.1: self.pool3 -> self.pool2
  • 9.4.4: code/p2ch09_explore_data.ipynb -> p2ch10?
  • 10.4.4: getRawNodule -> getRawCandidate
  • Listing 10.12: add # before 1((CO10-1)) and 1((CO10-2))? They read as if they were part of the code
  • 11.1: lets* you move forward

Some feedback on part 2: I'd say the data pipeline complexity was a bit of overkill, particularly the caching. There wasn't really a need for it when one can simply extract the candidates/nodules into their own files (sketched below), which is both simpler and faster. It's good to know the how-to, but among everything else the reader is supposed to learn, this may add redundant overhead. Otherwise, quality material, especially in part 1, which gradually adds layers of complexity starting at w*x + b.
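For concreteness, a minimal sketch of the "one file per candidate" alternative; the directory layout, filename convention, and crop shape here are hypothetical, not the book's actual pipeline:

```python
from pathlib import Path

import numpy as np
import torch
from torch.utils.data import Dataset

class ExtractedCandidateDataset(Dataset):
    """Reads candidate crops that were extracted once, up front, to
    one .npy file each, instead of caching CT chunks on the fly."""

    def __init__(self, root="data/candidates"):
        self.paths = sorted(Path(root).glob("*.npy"))

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        arr = np.load(self.paths[idx])   # e.g. a (32, 48, 48) crop
        # assumed filename convention: "<series_uid>_<i>_<label>.npy"
        label = int(self.paths[idx].stem.split("_")[-1])
        return torch.from_numpy(arr).unsqueeze(0), label  # add channel dim
```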

Heya, yeah. Between the three of us, we've had mixed feelings about the caching, too, and the main reason to have it is to provide the perspective that such a thing exists. Obviously, some of us also have inclinations towards more targeted data tooling, e.g. hangar (video).
If you later read the remainder, I'd be keen to hear your feedback on that, too, in particular ch14 and 15.

And thank you for your very thorough and critical reading, and for sharing your comments. It is very interesting and, indeed, helpful.