triton-lang / triton

Development repository for the Triton language and compiler

Home Page: https://triton-lang.org/


Cannot specify which device to use

AndrejHafner opened this issue · comments

Hello.

I ran into an issue while trying to compile a PyTorch model. I have a node with multiple GPUs and was using device cuda:1. Nothing was running on that GPU and its VRAM was not utilized, yet compilation failed with a CUDA OOM error, thrown on these lines:

if fast_flush:
    cache = torch.empty(int(cache_size // 4), dtype=torch.int, device='cuda')
else:
    cache = torch.empty(int(cache_size), dtype=torch.int8, device='cuda')

It looks like the code does not use the device the model is on, but always uses 'cuda', which defaults to cuda:0. In my case, device cuda:0 was already in use and its memory was completely full, which led to the OOM error.

It would be good if the testing code here used the model's device, or allowed the device to be specified explicitly; a sketch of what that could look like follows.
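For illustration only, here is a minimal sketch of a device-aware allocation. flush_cache is a hypothetical helper, not Triton's actual API; the point is simply to fall back to the active CUDA device instead of the hard-coded 'cuda':

import torch

def flush_cache(cache_size: int, fast_flush: bool = True, device=None):
    # Honor an explicit device, falling back to the currently active
    # CUDA device rather than 'cuda' (which always resolves to cuda:0).
    if device is None:
        device = torch.device('cuda', torch.cuda.current_device())
    if fast_flush:
        # int32 elements, so divide the byte size by 4
        return torch.empty(int(cache_size // 4), dtype=torch.int, device=device)
    return torch.empty(int(cache_size), dtype=torch.int8, device=device)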

Best regards,
Andrej

Use torch.cuda.set_device to change the default CUDA device, or pass 'cuda:1' instead of 'cuda'.

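For example, a minimal sketch of that workaround; model and example_input are placeholders:

import torch

# Make cuda:1 the default CUDA device, so the hard-coded device='cuda'
# in the benchmarking code resolves to it.
torch.cuda.set_device('cuda:1')

# Alternatively, scope the default device to a block:
with torch.cuda.device(1):
    compiled = torch.compile(model)
    out = compiled(example_input)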