A doubt at line 350 in dqn.cpp

Question

chiggum opened this issue 9 years ago

Firstly, sorry for posting this as an issue; I wasn't able to find your email address.

Doubt: I can't tell whether solver_->Step(1) makes the solver take a million steps, as mentioned in dqn_solver.txt, or only a single step.
As I understand it, 1 step == forward propagate, calculate the errors, backpropagate the errors, and update the filter weights.

Can you please clarify this?
Sorry, I wasn't able to understand this part of the Caffe documentation.

Thanks!
Regards.

Yasuhiro Fujita · Answer 1 · Sat Feb 07 2015 17:34:39 GMT+0800 (China Standard Time)

No problem. Feel free to ask any question here.

solver_->Step(1) is a single step, an update from a minibatch of 32 transitions.
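
To make "a single step" concrete, here is a minimal sketch of one such update. It is not Caffe's implementation and not the code in dqn.cpp: `ToySolver`, `Transition`, and `MakeToyMinibatch` are hypothetical names, the "network" is assumed to be a two-weight linear model, and the TD targets are assumed to be precomputed. It only shows that `Step(1)` performs exactly one forward pass, one backward pass, and one weight update over the minibatch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy types for illustration only; none of this is Caffe's API.
struct Transition {
    std::array<float, 2> state;  // toy two-feature state
    float target;                // assumed precomputed TD target
};

struct ToySolver {
    std::array<float, 2> weights{0.0f, 0.0f};
    float learning_rate = 0.1f;
    int iter = 0;  // number of weight updates performed so far

    // Runs `iters` updates. Each update uses the whole minibatch once.
    void Step(int iters, const std::vector<Transition>& minibatch) {
        assert(!minibatch.empty());
        const float n = static_cast<float>(minibatch.size());
        for (int i = 0; i < iters; ++i) {
            std::array<float, 2> grad{0.0f, 0.0f};
            for (const Transition& t : minibatch) {
                // Forward pass: linear estimate of the Q-value.
                const float q = weights[0] * t.state[0] + weights[1] * t.state[1];
                // Error against the target.
                const float error = q - t.target;
                // Backward pass: gradient of 0.5 * error^2 w.r.t. each weight.
                grad[0] += error * t.state[0];
                grad[1] += error * t.state[1];
            }
            // Weight update: plain SGD on the mean gradient.
            for (std::size_t k = 0; k < weights.size(); ++k) {
                weights[k] -= learning_rate * grad[k] / n;
            }
            ++iter;
        }
    }
};

// Builds a toy minibatch of `n` identical transitions (state {1, 0}, target 1).
std::vector<Transition> MakeToyMinibatch(std::size_t n) {
    return std::vector<Transition>(n, Transition{{1.0f, 0.0f}, 1.0f});
}
```

Calling `solver.Step(1, MakeToyMinibatch(32))` on a fresh `ToySolver` increments `iter` by one and moves `weights[0]` a small distance toward the target; calling it a million times is what a training loop would do, not what a single `Step(1)` does.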

Dhruv Kohli · Answer 2 · Sat Feb 07 2015 17:57:24 GMT+0800 (China Standard Time)

Thanks a lot! I am also training an agent (ROM: Breakout), but I am running into quite a few difficulties because I am using the cuDNN library rather than Caffe. I'll post my further questions here, and I hope to get some help.

Thanks a lot again.
Regards.

Dhruv Kohli · Answer 3 · Sun Feb 08 2015 14:04:01 GMT+0800 (China Standard Time)

Hi again,
As I said earlier, I am using Breakout as the ROM. I have written some code and it runs fine, but unfortunately the agent is not learning: the slider in the game (the agent) gets stuck at either the left or the right end of the screen.

After a lot of struggle, I still can't work out the reason. Do you have any idea why this is happening, or could you please have a look at my code? I have added you as a collaborator on my private repo.

Thanks!
Regards.

Yasuhiro Fujita · Answer 4 · Sun Feb 08 2015 20:42:44 GMT+0800 (China Standard Time)

I haven't read your code yet, but the discussion in this thread may be of some help:
https://groups.google.com/forum/#!topic/deep-q-learning/k1CzQxOi_v4

Dhruv Kohli · Answer 5 · Mon Feb 09 2015 01:45:29 GMT+0800 (China Standard Time)

Thanks a lot for that link! I think I can figure it out from here.