BUG: RuntimeError: Triton Error [CUDA]: device kernel image is invalid, and the unit tests report 4 errors
qinzhenyi1314 opened this issue · comments
OS: 18.04.1
NVIDIA driver: 470.57.02
GPU: A5000
Python: 3.10
CUDA: 11.8
Triton: 2.3.1
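One thing worth cross-checking in this environment: driver 470.57.02 predates the CUDA 11.8 toolkit, and a driver older than the toolkit that built the kernels is a common cause of "device kernel image is invalid". A minimal diagnostic sketch (not from the issue; `gpu_info` is a hypothetical helper) to print the driver and GPU for comparison, degrading gracefully when no NVIDIA driver is present:

```python
# Hedged diagnostic sketch: report GPU name and driver version so they can be
# compared against the CUDA toolkit version Triton was built/run with.
import shutil
import subprocess

def gpu_info() -> str:
    """Return nvidia-smi's name/driver report, or a placeholder if absent."""
    if shutil.which("nvidia-smi") is None:
        return "nvidia-smi not found (no NVIDIA driver on PATH)"
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv"],
        capture_output=True, text=True,
    )
    return result.stdout.strip() or result.stderr.strip()

print(gpu_info())
```

If the reported driver is older than what the installed CUDA toolkit expects, upgrading the driver (or downgrading the toolkit) is worth trying before debugging Triton itself.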
Running `python3 -m pytest python/test/unit` also produces errors.
Upgrading to triton-nightly-3.0.0.post20240424212437 also shows the error.
The error is reported when running my LLM, which is based on Llama 3.
After upgrading to triton-nightly-3.0.0.post20240424212437, my Llama-3-based LLM still fails.
It looks like you're missing some dependencies. You need to install them first:
pip install scipy numpy torch pytest lit pandas matplotlib && pip install -e python
The error in my LLM project:
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/119df232ab9fca4a1be87f95c239d7b9a765032e/util.py", line 469, in forward
q = apply_rotary_emb_func(
File "/root/.cache/huggingface/modules/transformers_modules/119df232ab9fca4a1be87f95c239d7b9a765032e/util.py", line 329, in apply_rotary_emb
return ApplyRotaryEmb.apply(
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/function.py", line 553, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/root/.cache/huggingface/modules/transformers_modules/119df232ab9fca4a1be87f95c239d7b9a765032e/util.py", line 255, in forward
out = apply_rotary(
File "/root/.cache/huggingface/modules/transformers_modules/119df232ab9fca4a1be87f95c239d7b9a765032e/util.py", line 212, in apply_rotary
rotary_kernel[grid](
File "/opt/conda/lib/python3.10/site-packages/triton/runtime/jit.py", line 167, in &lt;lambda&gt;
return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/triton/runtime/jit.py", line 425, in run
kernel.run(grid_0, grid_1, grid_2, kernel.num_warps, kernel.num_ctas, # number of warps/ctas per instance
File "/opt/conda/lib/python3.10/site-packages/triton/compiler/compiler.py", line 255, in __getattribute__
self._init_handles()
File "/opt/conda/lib/python3.10/site-packages/triton/compiler/compiler.py", line 250, in _init_handles
self.module, self.function, self.n_regs, self.n_spills = driver.utils.load_binary(
RuntimeError: Triton Error [CUDA]: device kernel image is invalid
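The failure happens in `_init_handles`, i.e. when the already-compiled kernel binary is loaded onto the device. One low-risk thing to try (a sketch, not confirmed as the fix for this issue) is clearing Triton's on-disk kernel cache, which defaults to `~/.triton/cache`; stale binaries compiled under a different driver or toolkit can fail to load with exactly this error, and clearing the cache forces recompilation on the next launch:

```python
# Remove Triton's default on-disk kernel cache so kernels are recompiled.
# Assumption: stale cached binaries from an older driver/toolkit are the cause.
import os
import shutil

cache_dir = os.path.expanduser("~/.triton/cache")
if os.path.isdir(cache_dir):
    shutil.rmtree(cache_dir)
    print("cleared", cache_dir)
else:
    print("no Triton cache found at", cache_dir)
```

If the error persists after a clean cache, the mismatch is more likely between the driver and the CUDA toolkit than in the cache itself.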
Have you solved this problem? I also ran into it when running locally on a 3090 Ti.