Distributed mode for single GPU
TheodorPatrickZ opened this issue
Is it possible to run itr_flickr not distributed, but on a single GPU?
When running:
python run.py --task "itr_flickr" --dist "gpu0" --output_dir "output/itr_flickr" --checkpoint "4m_base_finetune/itr_flickr/checkpoint_best.pth"
I get:
Training Retrieval Flickr
| distributed init (rank 0): env://
Traceback (most recent call last):
File "Retrieval.py", line 381, in <module>
main(args, config)
File "Retrieval.py", line 215, in main
utils.init_distributed_mode(args)
File "C:\Users..\X-VLM-master\utils\__init__.py", line 357, in init_distributed_mode
world_size=args.world_size, rank=args.rank)
File "C:\Users..\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\distributed\distributed_c10d.py", line 434, in init_process_group
init_method, rank, world_size, timeout=timeout
File "C:\Users..\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\distributed\rendezvous.py", line 82, in rendezvous
raise RuntimeError("No rendezvous handler for {}://".format(result.scheme))
RuntimeError: No rendezvous handler for env://
Hi,
Our code can run on a single GPU by specifying --dist "gpu0".
I didn't get this error myself, and I have no idea what causes it. Sorry.
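For anyone hitting the same "No rendezvous handler for env://" error: the env:// init method reads MASTER_ADDR, MASTER_PORT, RANK, and WORLD_SIZE from the environment, and on some setups (notably Windows with older torch builds) these are not populated for a single-process run. Below is a minimal sketch of a hypothetical helper (setup_single_gpu_env is not part of X-VLM) that fills in sane defaults before init_distributed_mode is called; the actual fix in your environment may differ.

```python
import os

def setup_single_gpu_env(port: int = 29500) -> dict:
    """Populate the rendezvous variables that init_method='env://' expects
    for a single-process (rank 0, world size 1) run.

    Hypothetical helper for illustration; values are only set if the
    variables are not already defined, so a real launcher still wins.
    """
    env = {
        "MASTER_ADDR": "127.0.0.1",  # rendezvous on localhost
        "MASTER_PORT": str(port),    # any free port
        "RANK": "0",                 # single process -> rank 0
        "WORLD_SIZE": "1",           # single process -> world size 1
    }
    for key, value in env.items():
        os.environ.setdefault(key, value)
    return env

# With these variables set, a call like
#   torch.distributed.init_process_group(backend="gloo",
#                                        init_method="env://",
#                                        rank=0, world_size=1)
# should find a rendezvous handler (gloo rather than nccl is the
# usual choice on Windows; this is an assumption, not tested here).
```

Calling setup_single_gpu_env() before the script's distributed init is one way to make env:// resolvable without a multi-process launcher.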
Got it running after looking at it again the next day, thanks for the fast response!