Invalid gpu argument

yu-rp opened this issue 2 years ago

Dear author,

I am running pokemon_finetune.ipynb with the following settings:

# 2xA6000:
BATCH_SIZE = 4
N_GPUS = 1
ACCUMULATE_BATCHES = 1

gpu_list = ",".join((str(x) for x in range(N_GPUS))) + ","
print(f"Using GPUs: {gpu_list}")
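
To make the expression above concrete, here is a small sketch of what it evaluates to for a few GPU counts. The helper name build_gpu_list is hypothetical and is not part of the notebook; it only wraps the same join expression.

```python
def build_gpu_list(n_gpus: int) -> str:
    """Return GPU indices 0..n_gpus-1 as a comma-separated string with a trailing comma."""
    return ",".join(str(x) for x in range(n_gpus)) + ","


# Show the resulting string for a few GPU counts.
for n in (1, 2, 4):
    print(f"N_GPUS={n} -> gpu_list={build_gpu_list(n)!r}")
```

With N_GPUS = 1 the string is "0,", so a correctly substituted command would not pass an empty value to --gpus.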

I then run the code block that calls python main.py:

# Run training
!(python main.py \
    -t \
    --base configs/stable-diffusion/pokemon.yaml \
    --gpus "$gpu_list" \
    --scale_lr False \
    --num_nodes 1 \
    --check_val_every_n_epoch 10 \
    --finetune_from "$ckpt_path" \
    data.params.batch_size="$BATCH_SIZE" \
    lightning.trainer.accumulate_grad_batches="$ACCUMULATE_BATCHES" \
    data.params.validation.params.n_gpus="$NUM_GPUS" \
)

I got the following error:

main.py: error: argument --gpus: invalid _gpus_allowed_type value: ''

Could you please let me know why?

Raphaël Merx · Answer 1 · Sat Nov 12 2022 16:57:36 GMT+0800 (China Standard Time)

Try passing it as an argument directly:

--gpus 0, \

Chanchana Sornsoontorn · Answer 2 · Sat Dec 31 2022 23:13:56 GMT+0800 (China Standard Time)

> try passing it as an argument directly
> --gpus 0, \

Why does this work? Is there an intuitive explanation?
How can we make it work with a variable instead of a hard-coded value?

Chanchana Sornsoontorn · Answer 3 · Sun Jan 01 2023 00:37:25 GMT+0800 (China Standard Time)

I'm not sure why the hack above partly works, but I now know the real culprit: a typo in the pokemon finetune notebook.
$NUM_GPUS in the 2nd cell should be $N_GPUS.
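
As an illustration of the fix, the sketch below builds the same command in plain Python with the variable name corrected. It is not the notebook's own code: the ckpt_path value is a hypothetical placeholder, and building an argument list is only one way to do this. One possible reason the typo surfaces as a --gpus error (an assumption, not confirmed in this thread) is that IPython leaves the whole ! command unexpanded when any $name in it is undefined, so the shell then substitutes an empty value for $gpu_list. Building the arguments with f-strings avoids that failure mode, because a misspelled Python name raises NameError instead of silently becoming an empty string.

```python
import shlex

BATCH_SIZE = 4
N_GPUS = 1
ACCUMULATE_BATCHES = 1
ckpt_path = "path/to/checkpoint.ckpt"  # hypothetical placeholder, not from the thread

# Same expression as in the notebook: "0," for one GPU.
gpu_list = ",".join(str(x) for x in range(N_GPUS)) + ","

cmd = [
    "python", "main.py",
    "-t",
    "--base", "configs/stable-diffusion/pokemon.yaml",
    "--gpus", gpu_list,
    "--scale_lr", "False",
    "--num_nodes", "1",
    "--check_val_every_n_epoch", "10",
    "--finetune_from", ckpt_path,
    f"data.params.batch_size={BATCH_SIZE}",
    f"lightning.trainer.accumulate_grad_batches={ACCUMULATE_BATCHES}",
    f"data.params.validation.params.n_gpus={N_GPUS}",  # N_GPUS, not NUM_GPUS
]

# Fail early if any argument ended up empty.
assert all(arg != "" for arg in cmd), "empty argument in command"

# Print the command for inspection before running it.
print(shlex.join(cmd))
```

To launch training from this list, something like subprocess.run(cmd, check=True) from the standard library should work in place of the ! cell.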