VPG not working after #294
Devanshu24 opened this issue · comments
After the refactoring that was done, I tried updating the tutorial code accordingly:
```python
import gym  # OpenAI Gym
from genrl.agents import VPG
from genrl.trainers import OnPolicyTrainer
from genrl.environments import VectorEnv

env = VectorEnv("CartPole-v1")
agent = VPG('mlp', env)
trainer = OnPolicyTrainer(agent, env, epochs=200)
trainer.train()
```
Upon running this code I get the following error:

```
Traceback (most recent call last):
  File "/home/think__tech/Desktop/genrl/temp.py", line 10, in <module>
    trainer.train()
  File "/home/think__tech/Desktop/genrl/genrl/trainers/onpolicy.py", line 46, in train
    self.agent.get_traj_loss(values, done)
  File "/home/think__tech/Desktop/genrl/genrl/agents/deep/vpg/vpg.py", line 114, in get_traj_loss
    self.rollout.compute_returns_and_advantage(values.detach().cpu().numpy(), dones)
  File "/home/think__tech/Desktop/genrl/genrl/core/rollout_storage.py", line 233, in compute_returns_and_advantage
    last_return = self.rewards[step] + next_non_terminal * next_value
TypeError: mul(): argument 'other' (position 1) must be Tensor, not numpy.ndarray
```
The reason I found was this:
`genrl/genrl/agents/deep/vpg/vpg.py`, line 114 in a1f0f73
Here we are passing the values as a NumPy array, but
`genrl/genrl/core/rollout_storage.py`, line 190 in a1f0f73
expects a Torch tensor. Hence changing `values.detach().cpu().numpy()` to `values` does solve the problem.
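This mismatch can be reproduced outside genrl. Here is a minimal sketch (assuming only `torch`; the variable names are illustrative stand-ins, not genrl's actual code): the rollout buffer holds rewards as torch tensors, so the values passed into `compute_returns_and_advantage` need to stay tensors too, since on some torch versions multiplying a tensor by a `numpy.ndarray` raised exactly this `TypeError`:

```python
import torch

# Illustrative stand-ins for the rollout buffer contents (torch tensors).
rewards = torch.zeros(5)
next_non_terminal = torch.ones(5)

# What the agent computes: values on the autograd graph.
values = torch.ones(5, requires_grad=True)

# values.detach().cpu().numpy() would hand the storage a numpy.ndarray;
# keeping the values as a (detached) tensor avoids the mixed-type multiply.
next_value = values.detach()

# Mirrors `last_return = self.rewards[step] + next_non_terminal * next_value`.
last_return = rewards + next_non_terminal * next_value
```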
I don't, however, know where torch is being explicitly used in this function (`compute_returns_and_advantage`).
Also, on a side note: I was checking the CI tests for VPG, and they exercise this very function, yet the CI still passed on that merge. Why is that the case?
UPD: Come to think of it, something could be wrong with my local machine? I did reinstall the library though
After the refactoring that was done, I tried updating the tutorial code accordingly:

```python
import gym  # OpenAI Gym
from genrl.agents import VPG
from genrl.trainers import OnPolicyTrainer
from genrl.environments import VectorEnv

env = VectorEnv("CartPole-v1")
agent = VPG('mlp', env)
trainer = OnPolicyTrainer(agent, env, epochs=200)
trainer.train()
```
This runs for me, gives the following expected output:
```
timestep  Epoch  loss     mean_reward
0         0      7.5798   19.50480079650879
10240     10     10.2152  25.315200805664062
20480     20     12.8218  34.076499938964844
30720     30     15.2794  44.61869812011719
40960     40     18.8412  59.5349006652832
51200     50     21.3708  74.2029037475586
61440     60     24.8012  97.06159973144531
71680     70     26.8724  114.41339874267578
81920     80     30.5441  144.22540283203125
92160     90     32.3745  166.5041046142578
102400    100    34.4109  186.18179321289062
112640    110    35.6278  206.8686981201172
122880    120    36.009   222.60870361328125
133120    130    35.3222  220.21499633789062
143360    140    36.9469  249.756103515625
153600    150    36.5153  240.94119262695312
163840    160    38.5735  292.5714111328125
174080    170    36.6223  265.9739990234375
184320    180    39.6573  330.3226013183594
194560    190    39.0522  325.07940673828125
```
UPD: Come to think of it, something could be wrong with my local machine? I did reinstall the library though
Are you installing from source?
UPD: Come to think of it, something could be wrong with my local machine? I did reinstall the library though
Are you installing from source?
Yes, to be precise this:

```
sudo python setup.py install
```
After this, do you run `python` or `sudo python`?
Also, just to be careful, can you try `python3` / `sudo python3`?
Steps:
1. `python setup.py install` does not work, apparently insufficient permissions
2. `sudo python setup.py install` installs the dependencies and things
3. `python temp.py` or `sudo python temp.py` or `python3 temp.py` or `sudo python3 temp.py` - I get the error

Just `python temp.py` used to work before.
OS: Ubuntu 18.04
UPD: I don't think it would, but is a reboot necessary?
Can you try running this and check if the path of the library is where your source installation is?
```python
import genrl
genrl.__file__
```
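The resolved path can also be checked via `importlib` (using `json` as a stand-in module name here so the snippet runs anywhere; substitute `"genrl"` in the real case):

```python
import importlib.util

# spec.origin is the file Python would actually import for this module;
# for a source install it should point inside your checkout.
spec = importlib.util.find_spec("json")
print(spec.origin)
```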
Can you try running this and check if the path of the library is where your source installation is?

```python
import genrl
genrl.__file__
```
Yes, it is where the git repo is (if that's what you mean by source installation; I couldn't understand clearly).
`genrl/genrl/core/rollout_storage.py`, line 190 in a1f0f73
It expects a Torch tensor. Hence changing `values.detach().cpu().numpy()` to `values` does solve the problem.
This is still a bug though, feel free to raise a PR for this.
I might be doing some very silly mistake too, so if you're certain that the code is fine we can close this issue :)
It should be `values.detach().clone()` just to be careful though.
Once you make the change, can you try training it?
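For context on the `detach().clone()` suggestion, a generic torch sketch (not genrl code): `detach()` alone returns a tensor that shares storage with the original, so later in-place updates to the agent's values would silently leak into whatever the rollout storage kept, while `detach().clone()` makes an independent copy off the autograd graph:

```python
import torch

values = torch.ones(3, requires_grad=True)

shared = values.detach()               # shares storage with `values`
independent = values.detach().clone()  # fresh copy, off the graph

# Mutate the original in place (inside no_grad so autograd permits it).
with torch.no_grad():
    values.mul_(10)

# `shared` now reflects the change; `independent` still holds the old data.
```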
`genrl/genrl/core/rollout_storage.py`, line 190 in a1f0f73
It expects a Torch tensor. Hence changing `values.detach().cpu().numpy()` to `values` does solve the problem.
This is still a bug though, feel free to raise a PR for this.
Also I actually tested after making this change and it does work
I might be doing some very silly mistake too, so if you're certain that the code is fine we can close this issue :)
Regardless of whether I'm able to reproduce it or not, the `compute_returns` function should not receive a NumPy array after #294.
So feel free to raise a PR.
It should be `values.detach().clone()` just to be careful though.
Once you make the change, can you try training it?
Yup, works after this too.
Just out of curiosity, any reason it passed the CI and worked for you but not for me?
Not sure. You could try and debug that though :)
Wanted to give an update: earlier I was using torch 1.3.0, so I manually updated to the latest 1.6.0, and now it works (even the old code, i.e. with `values.detach().cpu().numpy()`).
I was operating under the assumption that running `python setup.py install` makes sure the correct versions are present; is that not true, or was it a glitch?
`requirements.txt` does specify 1.4.0
`requirements.txt` does specify 1.4.0
Exactly, so when I did `python setup.py install`, it should have checked/updated that, right? Or does that not happen?
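As a side note on checking what actually got installed (a stdlib-only sketch for Python 3.8+; `installed_version` is a hypothetical helper, not a genrl API): setuptools generally only re-fetches a dependency when the installed version fails the declared specifier, so whether an existing torch 1.3.0 gets upgraded depends on how the requirement is written (`torch==1.4.0` forces the pin, while a bare `torch` accepts whatever is present):

```python
from importlib.metadata import PackageNotFoundError, version  # Python 3.8+

def installed_version(name):
    """Return the installed distribution's version string, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None

# In the real case: installed_version("torch")
result = installed_version("this-package-does-not-exist")
print(result)  # None
```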