RuntimeError: CUDA error: an illegal memory access was encountered
shilpaullas4 opened this issue · comments
Hi @CoinCheung,
I tried running your BiSeNetV2 implementation on the ADE20K dataset and it works!
I would now like to train on a custom dataset using BiSeNetV2.
Dataset Details :
- image size : 1920x1080
- Only two values in my label image {0, 127}
Config :
cfg = dict(
    model_type='bisenetv2',
    n_cats=2,
    num_aux_heads=4,
    lr_start=5e-3,
    weight_decay=5e-4,
    warmup_iters=1000,
    max_iter=29000,
    dataset='CustomerDataset',
    im_root='./datasets/custom/',
    train_im_anns='./datasets/custom/train.txt',
    val_im_anns='./datasets/custom/val.txt',
    scales=[0.5, 2.],
    cropsize=[512, 512],
    eval_crop=[512, 512],
    eval_start_shortside=512,
    eval_scales=[0.5, 0.75, 1, 1.25, 1.5, 1.75],
    ims_per_gpu=1,
    eval_ims_per_gpu=1,
    use_fp16=True,
    use_sync_bn=True,
    respth='./res',
)
But when I train with this dataset, I get the CUDA runtime error shown below:
loss_hard = loss[loss > self.thresh]
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
terminate called after throwing an instance of 'c10::CUDAError'
what(): CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
How do I solve this issue?
@shilpaullas4
I recommend that you refer to this link.
I think the cause of the error is a label mismatch: the values in your label images do not match the class indices the model expects.
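A quick way to check for this kind of mismatch is to list the values that actually occur in a label image and compare them with the valid class indices. The sketch below is only an illustration: `find_invalid_labels` is a hypothetical helper, not part of the repository, and treating 255 as an ignore value is an assumption based on a common segmentation convention.

```python
import numpy as np

def find_invalid_labels(label, n_classes, ignore_index=255):
    """Return sorted label values that are neither class indices nor the ignore value.

    Assumption: valid classes are 0 .. n_classes - 1 and `ignore_index`
    marks pixels that the loss skips.
    """
    values = np.unique(label)
    valid = ((values >= 0) & (values < n_classes)) | (values == ignore_index)
    return values[~valid].tolist()

# A toy mask using the two values reported in the question.
mask = np.array([[0, 127], [127, 0]], dtype=np.uint8)
print(find_invalid_labels(mask, n_classes=2))  # [127]
```

With `n_cats=2`, a value such as 127 falls outside the valid range, and an out-of-range target index in a loss running on the GPU is a typical source of an illegal memory access.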
Hi @Sangh0
Thanks for the reply.
I was able to resolve it by adding the following line to get_image in base_dataset.py:
_, label = cv2.threshold(label, 0, 255, cv2.THRESH_BINARY)
def get_image(self, impth, lbpth):
    img = cv2.imread(impth)[:, :, ::-1].copy()
    label = cv2.imread(lbpth, 0)
    _, label = cv2.threshold(label, 0, 255, cv2.THRESH_BINARY)
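One thing worth double-checking for anyone reusing this fix: with a max value of 255, `cv2.threshold` turns the labels into {0, 255}. If the loss treats 255 as an ignore value (an assumption here, though a common convention), the foreground pixels would be excluded from training rather than learned as class 1. A minimal sketch of mapping the mask to contiguous class indices instead, using a hypothetical helper `to_class_indices`:

```python
import numpy as np

def to_class_indices(label):
    """Map a binary mask to class indices: 0 stays background, any non-zero value becomes 1."""
    return (label > 0).astype(np.uint8)

# Toy mask with the values from the question.
mask = np.array([[0, 127], [127, 0]], dtype=np.uint8)
print(to_class_indices(mask).tolist())  # [[0, 1], [1, 0]]
```

Passing 1 instead of 255 as the max value to `cv2.threshold` should have the same effect, since `THRESH_BINARY` writes the max value wherever the input exceeds the threshold.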