INTERNAL ASSERT FAILED
Qicheng-WANG opened this issue · comments
Hi there,
When I ran a quick test, "python3 -m tutel.examples.helloworld --batch_size=16", it showed the following error:
RuntimeError: (true) == (fp != nullptr)INTERNAL ASSERT FAILED at "/ssdisk2/tutel/tutel/custom/custom_kernel.cpp":46, please report a bug to PyTorch. CHECK_EQ fails.
Could you help me fix it? Thanks.
- Does print(torch.cuda.get_arch_list()) include sm_86?
- Can you try export USE_NVRTC=1 before running the example?
- Are you sure there is no other old CUDA installed so that an old nvcc command was wrongly called for this compilation?
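The first check in the list above can be scripted. The following is only a minimal sketch: the helper names arch_supported and check_current_gpu are made up for illustration, and the exact-match test is a simplification that ignores PTX forward compatibility (compute_XY entries). It compares the compute capability of GPU 0 with the architectures the installed PyTorch build targets.

```python
def arch_supported(capability, arch_list):
    """Return True if a (major, minor) capability has a matching sm_XY entry.

    Hypothetical helper: an exact string match only, which is a
    simplification of how CUDA binary compatibility really works.
    """
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


def check_current_gpu():
    """Compare GPU 0 with the architectures this PyTorch build targets."""
    import torch  # imported lazily so arch_supported works without PyTorch

    if not torch.cuda.is_available():
        print("CUDA is not available to this PyTorch build")
        return False
    capability = torch.cuda.get_device_capability(0)  # e.g. (8, 6)
    arch_list = torch.cuda.get_arch_list()            # e.g. ['sm_70', 'sm_86']
    print("device capability:", capability)
    print("build targets:", arch_list)
    return arch_supported(capability, arch_list)
```

Calling check_current_gpu() on the affected machine prints both values side by side. If it returns False, the installed PyTorch build was most likely not compiled for that GPU.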
Hi! I am running tutel on a Jetson Nano B01 (4GB version).
I also hit the error RuntimeError: (true) == (fp != nullptr)INTERNAL ASSERT FAILED at "/ssdisk2/tutel/tutel/custom/custom_kernel.cpp".
On the Nano:
1. print(torch.cuda.get_arch_list()) gives ['sm_53', 'sm_62', 'sm_72'].
2. I set export USE_NVRTC=1, but another error occurred.
3. My nvcc version is 10.2.3.
This is a PyTorch + CUDA problem, not a tutel problem. You need a PyTorch build based on at least cu117/cu118 so that torch.cuda.get_arch_list() includes sm_86.
You also need to update your CUDA SDK (e.g. to 12.0), since NVIDIA's newer GPUs are not compatible with its older NVCC SDKs.
CUDA 10.2.3 is too old: it cannot support any GPU newer than the V100 generation (sm_7x). CUDA 11 should support A100-class GPUs and CUDA 12 should support H100-class GPUs. After upgrading the CUDA SDK, please also reinstall a PyTorch that is built against at least cu118.
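To see which nvcc is actually picked up and how it compares with the CUDA version PyTorch was built against, something like the sketch below may help. The function names are hypothetical, and it assumes nvcc --version prints a line containing "release X.Y", which is the usual format.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_output):
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def nvcc_on_path():
    """Return (path, release) for the first nvcc on PATH, or (None, None)."""
    path = shutil.which("nvcc")
    if path is None:
        return None, None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return path, parse_nvcc_release(result.stdout)


def report():
    """Print the nvcc in use next to the CUDA version of the PyTorch build."""
    import torch  # imported lazily so the parser works without PyTorch

    path, release = nvcc_on_path()
    print("nvcc found at:", path, "release:", release)
    print("PyTorch built against CUDA:", torch.version.cuda)
```

If report() shows an nvcc release older than the CUDA version PyTorch was built against, an old toolkit earlier on PATH is a likely cause of the failed kernel compilation.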