ValueError: probabilities contain NaN in policy.py

Question

opened this issue 3 years ago · comments

Hey community,

I built an environment with OpenAI Gym and am now trying different settings and agents.
I started with the agent from the dqn_cartpole example (https://github.com/wau/keras-rl2/blob/master/examples/dqn_cartpole.py). At some point the calculation of the Q-values failed because of a NaN value. My traceback and the small changes I made to the settings are below.

My settings in comparison to the dqn_cartpole example:
Dense layers: 256, 64, 16 instead of 16, 16, 16
policy = BoltzmannQPolicy()
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, nb_steps_warmup=50000, target_model_update=1e-2, policy=policy)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=500000, visualize=False, verbose=2)
• Last training episode before error: 497280/500000: episode: 2960, duration: 13.926s, episode steps: 168, steps per second: 12, episode reward: 47056.579, mean reward: 280.099 [-10229.000, 8998.000], mean action: 45.298 [0.000, 96.000], loss: 60564033920565248.000000, mae: 3245972224.000000, mean_q: 3358134016.000000
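The log line above shows per-step rewards between roughly -10229 and 8998, a loss around 6e16 and a mean Q-value around 3e9, which suggests the Q-values are diverging well before the error appears. One common mitigation is to shrink the rewards before they reach the agent. The following is a minimal sketch under that assumption; `scale_reward`, the factor `1e-4` and the limit `1.0` are hypothetical choices for illustration, not values from the thread or from keras-rl2.

```python
def scale_reward(reward, scale=1e-4, limit=1.0):
    """Scale a raw reward, then clip it into [-limit, limit].

    Both `scale` and `limit` are illustrative assumptions and would
    need tuning for a real environment.
    """
    scaled = reward * scale
    return max(-limit, min(limit, scaled))


# Minimum, mean and maximum reward taken from the log line above.
raw_rewards = [-10229.0, 280.099, 8998.0]
scaled_rewards = [scale_reward(r) for r in raw_rewards]
```

In keras-rl2 a transformation like this would typically live in a `Processor` subclass that overrides `process_reward`; the wiring is left out here because the original post does not use a processor.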

Traceback (most recent call last):
  File "~environment.py", line 125, in
    dqn.fit(env, nb_steps=500000, visualize=False, verbose=2)
  File "~\python_env\lib\site-packages\rl\core.py", line 169, in fit
    action = self.forward(observation)
  File "~\python_env\lib\site-packages\rl\agents\dqn.py", line 227, in forward
    action = self.policy.select_action(q_values=q_values)
  File "~\python_env\lib\site-packages\rl\policy.py", line 227, in select_action
    action = np.random.choice(range(nb_actions), p=probs)
  File "mtrand.pyx", line 928, in numpy.random.mtrand.RandomState.choice
ValueError: probabilities contain NaN

I do not get this error when I use EpsGreedyQPolicy. Is there a way to understand why the NaNs are produced and how to avoid them?
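One plausible explanation, offered as an assumption rather than a confirmed diagnosis, is that the network itself starts emitting NaN Q-values once training diverges. A Boltzmann-style policy turns Q-values into probabilities and samples from them, so a single NaN reaches `np.random.choice` and raises. A greedy choice via `np.argmax` does not raise on NaN input, which could explain why EpsGreedyQPolicy appears to work. The sketch below illustrates that difference; `boltzmann_probs` and `greedy_action` are illustrative helpers, not the keras-rl2 source.

```python
import numpy as np


def boltzmann_probs(q_values, tau=1.0, clip=(-500.0, 500.0)):
    """Softmax over Q-values with temperature tau (illustrative helper)."""
    q = np.asarray(q_values, dtype=np.float64)
    # Clipping bounds the exponent but does not remove NaN entries.
    exp_values = np.exp(np.clip(q / tau, clip[0], clip[1]))
    return exp_values / np.sum(exp_values)


def greedy_action(q_values):
    """Greedy pick, as an epsilon-greedy policy makes when not exploring."""
    return int(np.argmax(q_values))


healthy_q = np.array([1.0, 2.0, 3.0])
broken_q = np.array([1.0, np.nan, 3.0])  # assumed output of a diverged network

probs_ok = boltzmann_probs(healthy_q)
probs_bad = boltzmann_probs(broken_q)  # every entry becomes NaN

# argmax accepts NaN input without raising, so this path fails silently.
silent_action = greedy_action(broken_q)

# Sampling requires valid probabilities, so the NaN surfaces as an error.
try:
    np.random.choice(len(broken_q), p=probs_bad)
    sampling_error = None
except ValueError as exc:
    sampling_error = str(exc)
```

If this is the cause, switching policies only hides the symptom. Checking `np.isfinite(q_values).all()` before action selection would show when the network output first breaks.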

Kind regards, Jonas

Ege Hoşgüngör · Answer 1 · Fri Aug 27 2021 15:24:02 GMT+0800 (China Standard Time)

I have the same problem

action = np.random.choice(
File "mtrand.pyx", line 928, in numpy.random.mtrand.RandomState.choice
ValueError: probabilities contain NaN

Any update on this issue?

jocelynbaduria · Answer 2 · Mon Nov 22 2021 10:49:00 GMT+0800 (China Standard Time)

Same here, I am also getting this error.

action = np.random.choice(self.action_size, 1, p=prob)[0]
File "mtrand.pyx", line 928, in numpy.random.mtrand.RandomState.choice
ValueError: probabilities contain NaN.

I would appreciate some help. Thanks.

jocelynbaduria · Answer 3 · Thu Nov 25 2021 03:24:30 GMT+0800 (China Standard Time)

I was able to fix this by checking my dataset, which contained zero values.
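The comment above does not say why the zeros caused trouble. A reasonable guess is that they fed into a division or a logarithm somewhere in the environment and produced inf or NaN observations or rewards. As a rough sketch under that assumption, a dataset can be scanned for such values before training; `report_problem_values` and the sample array are hypothetical and only serve as illustration.

```python
import numpy as np


def report_problem_values(data):
    """Count NaN, infinite and exactly-zero entries in a numeric array."""
    arr = np.asarray(data, dtype=np.float64)
    return {
        "nan": int(np.isnan(arr).sum()),
        "inf": int(np.isinf(arr).sum()),
        "zero": int((arr == 0).sum()),
    }


# Made-up sample with one NaN, one inf and one zero.
sample = np.array([[1.0, 0.0], [np.nan, 2.0], [3.0, np.inf]])
report = report_problem_values(sample)
```

A non-zero count in any category is a hint to inspect how that column is used, not proof that it causes the NaN probabilities.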

stale · Answer 4 · Sun Apr 17 2022 01:45:06 GMT+0800 (China Standard Time)

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Aldair Cuarez · Answer 5 · Wed Jun 21 2023 05:22:04 GMT+0800 (China Standard Time)

This is still an issue with keras-rl2 :(