microsoft / nlp-recipes

Natural Language Processing Best Practices & Examples

[ASK] How to use idle GPUs instead of the GPUs with IDs 0, 1, etc.?

dunalduck0 opened this issue · comments

commented

Description

My machine has 8 GPUs, and I want to split them between two training jobs. Is that possible? I found the parameter that specifies how many GPUs to use, but not one that specifies their IDs. As a result, both jobs compete for the GPUs with IDs 0, 1, 2, and so on, instead of using the ones that are idle. Please advise.

Other Comments

This can be done here:
https://github.com/microsoft/nlp-recipes/blob/staging/utils_nlp/models/transformers/common.py#L165

Currently it's not exposed to the user, but that's a good point. We should fix this and allow passing specific IDs as well.

commented

I found that GPUs can be assigned to a job via the CUDA_VISIBLE_DEVICES environment variable, though it's not a perfect solution.

pytorch/pytorch#20606
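
A minimal sketch of the CUDA_VISIBLE_DEVICES approach, assuming each job is started as a separate process. The helper names `env_for_gpus` and `run_on_gpus` are hypothetical, and the child command here only prints the variable; it stands in for a real training command.

```python
import os
import subprocess
import sys


def env_for_gpus(gpu_ids):
    """Copy the current environment and expose only the given physical GPU IDs."""
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def run_on_gpus(command, gpu_ids):
    """Run a command in a child process restricted to the given GPUs."""
    return subprocess.run(
        command,
        env=env_for_gpus(gpu_ids),
        capture_output=True,
        text=True,
        check=True,
    )


# Stand-in for a training command (assumption): the child only reports
# which devices it has been allowed to see.
show_visible = [
    sys.executable,
    "-c",
    "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])",
]

first_half = run_on_gpus(show_visible, [0, 1, 2, 3]).stdout.strip()
second_half = run_on_gpus(show_visible, [4, 5, 6, 7]).stdout.strip()
```

Inside each process, CUDA renumbers the visible devices starting from 0, so the second job would address physical GPUs 4 to 7 as devices 0 to 3. For two jobs that should run at the same time, `subprocess.Popen` with the same `env` argument would be the non-blocking equivalent.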

You can now pass gpu_ids; see #529.
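
Whichever mechanism is used, the IDs of the idle GPUs still have to be found first. The sketch below is one possible way to do that, assuming `nvidia-smi` is on the PATH and that "idle" can be approximated by low memory use. The function names and the 100 MiB threshold are hypothetical, and the sample text is made-up input for illustration, not real measurements.

```python
import subprocess


def parse_idle_gpus(smi_csv, max_used_mib=100):
    """Return the indices of GPUs whose used memory is at or below the threshold.

    Expects one "index, memory_used" pair per line, as produced by
    nvidia-smi with --format=csv,noheader,nounits.
    """
    idle = []
    for line in smi_csv.strip().splitlines():
        index, used = (field.strip() for field in line.split(","))
        if int(used) <= max_used_mib:
            idle.append(int(index))
    return idle


def query_idle_gpus(max_used_mib=100):
    """Ask nvidia-smi for per-GPU memory use and return the idle indices."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,memory.used",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_idle_gpus(result.stdout, max_used_mib)


# Made-up sample in the same shape as the nvidia-smi output (assumption).
sample = "0, 15021\n1, 14876\n2, 3\n3, 0\n"
idle = parse_idle_gpus(sample)
```

Memory use is only a heuristic: a job that has just started may not have allocated memory yet, so two jobs launched at nearly the same moment could still pick the same GPUs.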