PacktPublishing / Deep-Reinforcement-Learning-Hands-On

Hands-on Deep Reinforcement Learning, published by Packt

Chapter 06 DQN pong training

moayad-hsn opened this issue

Hi,
I ran into this error while running the code that trains the DQN agent on Pong:
8589: done 9 games, mean reward -20.444, eps 0.91, speed 124.21 f/s
9518: done 10 games, mean reward -20.400, eps 0.90, speed 121.48 f/s
Traceback (most recent call last):
File "02_dqn_pong.py", line 169, in <module>
loss_t = calc_loss(batch, net, tgt_net, device=device)

File "02_dqn_pong.py", line 96, in calc_loss
state_action_values = net(states_v).gather(1, actions_v.unsqueeze(-1)).squeeze(-1)

RuntimeError: index 17179869185 is out of bounds for dimension 1 with size 6

I want to know the reason for this indexing error. It happens when I start training the network, and I have no idea what its cause is.

No action index is that large. Valid actions range from 0 to env.action_space.n - 1 (that is, 0 to 5 on Pong, 6 actions in total). Check the actions_v array and make sure it really is the action array you intend to pass to gather.
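One observation that may help narrow this down: the reported index equals 4 * 2**32 + 1, which is exactly what two adjacent 32-bit integers (1 and 4, both valid Pong actions) look like when their bytes are read as a single 64-bit integer. This suggests, though nothing in this thread confirms it, that a 32-bit action array is being interpreted as 64-bit somewhere. The sketch below shows the arithmetic using only the standard library, plus a hypothetical helper, check_actions, that is not part of the book's code and could be called on the sampled actions before gather.

```python
import struct

bad_index = 17179869185  # the index reported in the RuntimeError

# Reinterpret the 8 bytes of the 64-bit value as two little-endian 32-bit ints.
low, high = struct.unpack("<2i", struct.pack("<q", bad_index))
print(low, high)  # 1 4


def check_actions(actions, n_actions):
    """Hypothetical helper: raise ValueError if an index is outside [0, n_actions)."""
    for pos, action in enumerate(actions):
        if not 0 <= int(action) < n_actions:
            raise ValueError(
                f"action {int(action)} at position {pos} "
                f"is outside 0..{n_actions - 1}"
            )


check_actions([0, 5, 2, 3], n_actions=6)  # valid Pong actions, passes silently
```

Calling check_actions on the batch's actions with n_actions set to env.action_space.n would show whether the values are already wrong before they reach the tensor, or only after conversion.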

Hi guys,

This error also appears when I use the CPU instead of the GPU.
If I use the GPU, this error appears:

Traceback (most recent call last):
File "...Chapter06/02_dqn_pong.py", line 176, in <module>
loss_t = calc_loss(batch, net, tgt_net, device=device)
File "...Chapter06/02_dqn_pong.py", line 97, in calc_loss
state_action_values = net(states_v).gather(1, actions_v.unsqueeze(-1)).squeeze(-1)
RuntimeError: Expected object of scalar type Long but got scalar type Int for argument #3 'index' in call to _th_gather_out

If I use '.long()', the code runs, but the speed decreases massively.

And:
print(actions_v.shape) -> torch.Size([32])
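The second traceback says gather needs an index tensor of type Long (int64). As an alternative to calling '.long()' afterwards, the index tensor can be created as int64 in the first place. The sketch below uses made-up stand-in data (q_values and actions are not taken from the book's code) to show the idea. Whether this removes the slowdown described above has not been verified here.

```python
import torch

# Stand-in for net(states_v): a batch of 4 states with 6 Q-values each.
q_values = torch.arange(24, dtype=torch.float32).reshape(4, 6)

# Stand-in for the sampled action indices of that batch.
actions = [0, 5, 2, 3]

# Request int64 explicitly when building the tensor, so gather receives
# the Long index it expects regardless of the platform's default int size.
actions_v = torch.tensor(actions, dtype=torch.int64)

# Same gather pattern as in calc_loss: pick the Q-value of the taken action.
state_action_values = q_values.gather(1, actions_v.unsqueeze(-1)).squeeze(-1)

print(actions_v.dtype)      # torch.int64
print(state_action_values)  # tensor([ 0., 11., 14., 21.])
```

If the actions arrive as a NumPy array, passing dtype=np.int64 when that array is built should have the same effect, since torch.tensor keeps the array's integer type.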