real-stanford / scalingup

[CoRL 2023] This repository contains data generation and training code for Scaling Up & Distilling Down

Home Page: https://www.cs.columbia.edu/~huy/scalingup/

Error: No available node types can fulfill resource request {'CPU': 1.0, 'GPU': 1.0}. Add suitable node types to this cluster to resolve this issue.

yellow07200 opened this issue

Hi,

Thanks for your amazing work.
I tried to train the network following your instructions:

python scalingup/train.py dataset_path=/home/yellow/scalingup/scalingup/wandb/run-20230810_101412-d96k833r/files evaluation=drawer algo=diffusion_default

But I got this error:

(autoscaler +32s) Error: No available node types can fulfill resource request {'CPU': 1.0, 'GPU': 1.0}. Add suitable node types to this cluster to resolve this issue.

I then checked the output of ray status:

======== Autoscaler status: 2023-08-11 10:30:05.114433 ========
Node status
Healthy:
1 node_5324e025d6f448e624493446922acea8142738cf15e5d890bfebf171
Pending:
(no pending nodes)
Recent failures:
(no failures)

Resources
Usage:
0.0/20.0 CPU
0B/6.32GiB memory
412.90MiB/3.16GiB object_store_memory

Demands:
{'CPU': 1.0, 'GPU': 1.0}: 1+ pending tasks/actors

I would really appreciate any suggestions!

Best regards,
Stella

I solved the issue by specifying the CPU and GPU counts when initializing Ray:

ray.init(num_cpus=1, num_gpus=1)
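
For anyone hitting the same message: the status output above lists CPU, memory, and object_store_memory under Resources, but no GPU entry, while the pending demand asks for one GPU. Passing num_gpus=1 to ray.init makes Ray advertise one logical GPU, which lets that demand be scheduled. The snippet below is only a minimal sketch of the comparison, not code from this repository. The helper name unmet_demand is hypothetical, and the dictionaries are copied by hand from the status output (memory entries omitted). On a live cluster the totals could instead come from ray.cluster_resources().

```python
def unmet_demand(total, demand):
    """Return the entries of `demand` that `total` cannot satisfy.

    Hypothetical helper for illustration; both arguments map a
    resource name (e.g. "CPU", "GPU") to a float amount.
    """
    return {
        name: amount
        for name, amount in demand.items()
        if total.get(name, 0.0) < amount
    }


# Totals copied from the status output above; there is no GPU entry.
cluster_total = {"CPU": 20.0}

# The demand reported as pending under "Demands".
pending = {"CPU": 1.0, "GPU": 1.0}

missing = unmet_demand(cluster_total, pending)
print(missing)
```

Here the result contains only the GPU entry, which matches the autoscaler error. Note that Ray resources are logical, so declaring num_gpus=1 does not by itself confirm that a CUDA device is visible to the training process; that is worth checking separately if training later fails on device placement.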