hpcaitech / ColossalAI-Examples

Examples of training models with hybrid parallelism using ColossalAI

ModuleNotFoundError: No module named 'colossal_layer_norm_cuda'

200987299 opened this issue

Dear developers,

I am trying to run the BERT example, but I get this error. Any hints on how to fix it?

Thanks

Hi, this is because your CUDA extension was not built. Can you make sure your PyTorch build uses the same CUDA version as your CUDA runtime? One way to check is to add the -v flag to your pip install command. If the CUDA extension is not built, the log will report the version mismatch.
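
The version check described above can also be sketched outside of pip. The snippet below is a minimal illustration, not part of ColossalAI: the helper names (`parse_nvcc_release`, `versions_match`, `extension_is_built`, `report`) are hypothetical, and it assumes that `nvcc` is on the PATH and prints a line containing `release X.Y`. It compares `torch.version.cuda` with the local toolkit version and checks whether the module named in the error can be found.

```python
import importlib.util
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'major.minor' toolkit version from `nvcc --version` text."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def versions_match(torch_cuda: Optional[str], nvcc_cuda: Optional[str]) -> bool:
    """Return True only when both versions are known and agree on major.minor."""
    if torch_cuda is None or nvcc_cuda is None:
        return False
    return torch_cuda.split(".")[:2] == nvcc_cuda.split(".")[:2]


def local_nvcc_version() -> Optional[str]:
    """Ask the local CUDA compiler for its version; None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


def torch_cuda_version() -> Optional[str]:
    """CUDA version PyTorch was built with; None if torch is missing or CPU-only."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


def extension_is_built(module_name: str = "colossal_layer_norm_cuda") -> bool:
    """Check whether the compiled extension module can be located by Python."""
    return importlib.util.find_spec(module_name) is not None


def report() -> None:
    """Print a short summary of the checks above."""
    torch_cuda = torch_cuda_version()
    nvcc_cuda = local_nvcc_version()
    print(f"PyTorch built with CUDA: {torch_cuda}")
    print(f"Local nvcc CUDA version: {nvcc_cuda}")
    print(f"Versions match: {versions_match(torch_cuda, nvcc_cuda)}")
    print(f"Extension importable: {extension_is_built()}")
```

Calling `report()` in the environment where the example fails prints the four values. A mismatch, or a missing `nvcc`, would be consistent with the extension having been skipped during installation.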

Hi, could you be more specific? I tried reinstalling everything a few times, but it failed every time. The output of pip install -v showed that everything was fine.

Thanks

Hi @200987299, we have updated and simplified the installation process, and the example code has been updated as well. Please try reinstalling. Thanks.