WXinlong / SOLO

SOLO and SOLOv2 for instance segmentation, ECCV 2020 & NeurIPS 2020.

CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)

BIGWangYuDong opened this issue

Hi,
I tried to train SOLOv1 on a Slurm cluster using:
GPUS=8 GPUS_PER_TASK=8 ./tools/slurm_train.sh XX XX config/solo/solo_r50_1x.py

During validation, it failed with a CUDA error:
CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)
[Image: 企业微信截图_16250431208763 (WeChat Work screenshot)]

The error seems to be raised at torch.mm, but I don't know how to fix it.
When I rerun the code, the error does not occur again.
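
Because the failure is intermittent and appears at torch.mm, one way to narrow it down is to record the GPU memory state at the moment the error is raised. CUBLAS_STATUS_NOT_INITIALIZED is often reported when cuBLAS cannot allocate memory for its handle, but that is only one possible cause and is not confirmed for this issue. The sketch below is a minimal illustration under that assumption; `describe_gpu_state` and `call_with_cublas_diagnostics` are hypothetical helper names, not part of SOLO, MMDetection, or PyTorch.

```python
def describe_gpu_state():
    """Return a one-line summary of GPU memory, or a note if it is unavailable."""
    try:
        import torch  # imported lazily so the helper also loads without PyTorch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return "CUDA is not available"
    device = torch.cuda.current_device()
    mib = 1024 ** 2
    total = torch.cuda.get_device_properties(device).total_memory / mib
    # These counters only cover tensors allocated by this process;
    # memory held by other processes on the same GPU is not included.
    allocated = torch.cuda.memory_allocated(device) / mib
    peak = torch.cuda.max_memory_allocated(device) / mib
    return "device={} allocated={:.0f}MiB peak={:.0f}MiB total={:.0f}MiB".format(
        device, allocated, peak, total
    )


def call_with_cublas_diagnostics(fn, *args, report=print, **kwargs):
    """Call fn; if it fails with the cuBLAS init error, report GPU state and re-raise."""
    try:
        return fn(*args, **kwargs)
    except RuntimeError as err:
        if "CUBLAS_STATUS_NOT_INITIALIZED" in str(err):
            report("cuBLAS handle creation failed: " + describe_gpu_state())
        raise  # the original exception is never swallowed
```

A call such as `call_with_cublas_diagnostics(torch.mm, a, b)` would replace the plain `torch.mm(a, b)` at the failing line. If the reported `allocated` value is close to `total`, memory pressure during validation is a plausible suspect; if not, the cause likely lies elsewhere.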

I ran the code with PyTorch 1.4 and CUDA 9.0.
I also reimplemented your code on MMDetection v2.0+ with PyTorch 1.5 and CUDA 9.0, and I sometimes get the same error there as well.
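
Since the error shows up under two different PyTorch versions, it may help to confirm which CUDA version each PyTorch build was actually compiled against, as this can differ from the CUDA toolkit installed on the cluster. The following is a minimal sketch for collecting that information on a compute node; `environment_summary` is a hypothetical helper name introduced here for illustration.

```python
import platform


def environment_summary():
    """Collect the Python, PyTorch, and CUDA details relevant to this report."""
    info = {"python": platform.python_version()}
    try:
        import torch  # imported lazily so the script still runs without PyTorch
    except ImportError:
        info["torch"] = None
        return info
    info["torch"] = torch.__version__
    # CUDA version this PyTorch build was compiled against (None for CPU-only builds).
    info["cuda_build"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["gpu_count"] = torch.cuda.device_count()
        info["gpu_name"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in environment_summary().items():
        print("{}: {}".format(key, value))
```

PyTorch also ships a more complete report that can be produced with `python -m torch.utils.collect_env`.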

Do you know how to fix this, or do you have any ideas?

Yudong