[ASK] How to use idle GPUs instead of the GPUs with IDs 0, 1, etc.?
dunalduck0 opened this issue
Description
My machine has 8 GPUs, and I want to split them between two training jobs. Is that possible? I found the parameter that specifies how many GPUs to use, but none for their IDs. As a result, each job competes for the GPUs with IDs 0, 1, 2, and so on, instead of using the ones that are idle. Please advise.
Other Comments
This can be done here:
https://github.com/microsoft/nlp-recipes/blob/staging/utils_nlp/models/transformers/common.py#L165
Currently it's not exposed to the user, but that's a good point. We should fix this and allow passing specific IDs as well.
I found that one can assign GPUs to a job via the CUDA_VISIBLE_DEVICES environment variable, though it's not a perfect solution.
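For anyone landing here, a minimal sketch of the CUDA_VISIBLE_DEVICES approach, assuming each training job runs as a separate process. The helper names (env_for_gpus, launch) and the 0-3 / 4-7 split are hypothetical and only for illustration; the child command here just prints the variable, so swap in your real training script.

```python
import os
import subprocess
import sys


def env_for_gpus(gpu_ids):
    """Return a copy of the current environment restricted to the given GPU IDs.

    Hypothetical helper: it only sets CUDA_VISIBLE_DEVICES and does not
    check whether those GPUs exist or are idle.
    """
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def launch(cmd, gpu_ids):
    """Start cmd as a child process that can only see the given GPUs."""
    return subprocess.Popen(cmd, env=env_for_gpus(gpu_ids))


if __name__ == "__main__":
    # Placeholder job: prints the GPUs it is allowed to see.
    # Replace with your real command, e.g. [sys.executable, "train.py"].
    show = [
        sys.executable,
        "-c",
        "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])",
    ]
    job_a = launch(show, [0, 1, 2, 3])  # first job gets GPUs 0-3
    job_b = launch(show, [4, 5, 6, 7])  # second job gets GPUs 4-7
    job_a.wait()
    job_b.wait()
```

Note that CUDA renumbers the visible devices inside each process, so the second job sees its four GPUs as IDs 0 to 3. The variable has to be set before the process initializes CUDA, which is why it is passed through the child's environment here.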
You can now pass gpu_ids; see #529.