Cannot start without fixing train.py
gertelrina opened this issue
I ran into a bug, described below:
File "../trivialaugment/TrivialAugment/train.py", line 353, in spawn_process
assert worldsize == C.get()['gpus'], f"Did not specify the number of GPUs in Config with which it was started: {worldsize} vs {C.get()['gpus']}"
It happens because "args.config" is unpacked incorrectly. You need to use "args.config[0]" instead of "args.config" (lines 343-344).
After this change it works correctly :)
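For readers unfamiliar with why indexing would help, here is a minimal sketch of the general pattern. It assumes the "--config" option is declared so that argparse returns a list (for example via nargs); that is a guess about the cause, not a statement about how train.py actually defines it. The function name parse_config_path and the path "confs/example.yaml" are hypothetical.

```python
import argparse


def parse_config_path(argv):
    """Return the first config path from a list-valued --config option."""
    parser = argparse.ArgumentParser()
    # Assumption: the option collects one or more values, so args.config
    # is a list even when a single path is passed.
    parser.add_argument("--config", nargs="+")
    args = parser.parse_args(argv)
    # Using args.config directly would hand a list to code that expects
    # a single path string; take the first element instead.
    return args.config[0]


if __name__ == "__main__":
    # "confs/example.yaml" is a placeholder path used only for illustration.
    print(parse_config_path(["--config", "confs/example.yaml"]))
```

If the option is in fact a plain string in your checkout, indexing with [0] would return only its first character, so it is worth printing type(args.config) before applying the change.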
Hi @gertelrina
Thanks for the shoutout. Could you please let me know what the exact diff of your change is? If you have it lying around still, I would be very happy about a stack trace as well. :)
I also encountered this bug.
File ".../trivialaugment-master/TrivialAugment/train.py", line 348, in spawn_process
assert worldsize == C.get()['gpus'], f"Did not specify the number of GPUs in Config with which it was started: {worldsize} vs {C.get()['gpus']}"
File "/root/ENTER/envs/trivial/lib/python3.8/site-packages/theconf-0.1.7-py3.8.egg/theconf/config.py", line 126, in __getitem__
return self.conf[key]
KeyError: 'gpus'
I haven't solved it yet, looking forward to the author's answer. Thanks!
Just as a quick follow-up: is there a "gpus" key in your config file? Just to be sure. I am trying to reproduce this right now on my end.
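A quick way to check this locally is sketched below. It is a minimal illustration on a plain dict, not the repo's own code: the helper name check_gpus is hypothetical, and it assumes the loaded config behaves like a mapping with an integer "gpus" entry.

```python
def check_gpus(conf, worldsize):
    """Return a message describing the problem, or None if the config is fine.

    conf is assumed to be a dict-like config; worldsize is the number of
    processes (GPUs) the run was started with.
    """
    if "gpus" not in conf:
        # This is the situation that surfaces as KeyError: 'gpus'.
        return "config has no 'gpus' key"
    if conf["gpus"] != worldsize:
        return f"config says {conf['gpus']} GPUs, but started with {worldsize}"
    return None


if __name__ == "__main__":
    print(check_gpus({}, 1))
    print(check_gpus({"gpus": 2}, 2))
```

A missing key and a mismatched count are reported separately here, which makes it easier to tell the KeyError case apart from the failing assertion in the first report.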
I just gave it a try with a fresh clone of this repo and all dependencies installed anew, and the training command from the README did not throw the error you had. So maybe try running the command exactly as it appears in the README and re-install the dependencies. :)