fcdl94 / WILSON

Official implementation of "Incremental Learning in Semantic Segmentation from Image Labels"

Home Page: https://arxiv.org/abs/2112.01882

RuntimeError: Class values must be smaller than num_classes.

adaxidedakaonang opened this issue

Hi, when I try to run the code on a single GPU (I deleted the distributed-related code), I get the following error partway through the third epoch, which is very strange:

Epoch 1, lr = 0.009699: 100%|█| 154/154 [02:29<00:00,  1.03it/s, loss=0.41
INFO:0: Epoch 1, Class Loss=0.4461328089237213, Reg Loss=0.0
INFO:0: End of Epoch 1/30, Average Loss=0.4461328089237213, Class Loss=0.4461328089237213, Reg Loss=0.0
INFO:0: End of Validation 1/30
INFO:0: Epoch 2, lr = 0.009398
Epoch 2, lr = 0.009398: 100%|█| 154/154 [02:31<00:00,  1.02it/s, loss=0.33
INFO:0: Epoch 2, Class Loss=0.3124752342700958, Reg Loss=0.0
INFO:0: End of Epoch 2/30, Average Loss=0.3124752342700958, Class Loss=0.3124752342700958, Reg Loss=0.0
INFO:0: End of Validation 2/30
INFO:0: Epoch 3, lr = 0.009095
Epoch 3, lr = 0.009095:  31%|▎| 48/154 [00:47<01:46,  1.01s/it, loss=0.233
Traceback (most recent call last):
  File "run.py", line 227, in <module>
    main(opts)
  File "run.py", line 123, in main
    epoch_loss = trainer.train(cur_epoch=cur_epoch, train_loader=train_loader)
  File "/home/lttm/Desktop/chang/WILSON-main/train.py", line 194, in train
    loss = criterion(outputs, labels)  # B x H x W
  File "/home/lttm/miniconda3/envs/pytorch1.8/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/lttm/Desktop/chang/WILSON-main/utils/loss.py", line 73, in forward
    targets = F.one_hot(labels_new, inputs.shape[1] + 1).float().permute(0, 3, 1, 2)
RuntimeError: Class values must be smaller than num_classes.

I also can't wake up my PC afterwards. Has anyone else run into this strange problem?

Hey @adaxidedakaonang.
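The error states that labels_new contains a class index that is greater than or equal to inputs.shape[1] + 1, the num_classes value passed to F.one_hot (valid indices run from 0 to inputs.shape[1]).
Please check that the number of classes is set correctly and that the labels do not contain any invalid values.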
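
To check the labels as suggested above, something like the following could be run on a batch just before the F.one_hot call. This is only a minimal sketch: find_invalid_labels is a hypothetical helper, not part of the WILSON code, and the tensor values and channel count are made up for illustration.

```python
import torch


def find_invalid_labels(labels: torch.Tensor, num_classes: int) -> list:
    """Return the sorted unique label values that F.one_hot would reject.

    F.one_hot(labels, num_classes) only accepts values in [0, num_classes - 1].
    """
    values = torch.unique(labels)  # sorted unique values in the label tensor
    bad = values[(values < 0) | (values >= num_classes)]
    return bad.tolist()


# Hypothetical example: 3 output channels, so one_hot receives 3 + 1 = 4 classes,
# mirroring the `inputs.shape[1] + 1` argument in the traceback.
num_classes = 3 + 1
labels = torch.tensor([[0, 1, 2], [3, 255, 1]])  # 255 is out of range here

invalid = find_invalid_labels(labels, num_classes)
print(invalid)
```

If the returned list is non-empty for any batch, the listed values are the ones that trigger the RuntimeError.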

Thanks, it seems the problem was with the GPU rather than the code. Decreasing num_workers (from 4 to 2) and decreasing the batch size solved the problem.