tsunghan-wu / RandLA-Net-pytorch

:four_leaf_clover: Pytorch Implementation of RandLA-Net (https://arxiv.org/abs/1911.11236)

Excessive memory requirements

huixiancheng opened this issue · comments

Hi, I would like to know how much memory you need for testing on SemanticKITTI. With batch=1, I need almost 32 GB of host RAM (not GPU memory). Is this normal? Is there any way to reduce that demand?

Hi @huixiancheng, did your test succeed? I ran into some problems when testing: I can't run the testing process on sequences 13, 19, and 21, although the other sequences succeed. Could you give me some advice?

This may be caused by running out of memory. A simple workaround is to slice the data list here:
https://github.com/tsunghan-mama/RandLA-Net-pytorch/blob/913837e846176e4247a7e21783bf8f2f38576257/dataset/semkitti_testset.py#L26

For example, sequence 08 has 4071 scans. Just run inference twice. It is rough but effective, and it had no impact on accuracy in my tests.
The first run uses
self.data_list = sorted(self.data_list)[0:3000]
Then run inference again with
self.data_list = sorted(self.data_list)[3000:]
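
The two slices above can be generalized to any chunk size. The following is a minimal sketch, not code from this repo: `split_into_chunks` and the file names are hypothetical, and it only shows how the sorted list would be divided so that each inference run handles one chunk.

```python
def split_into_chunks(items, chunk_size):
    """Sort the items, then return consecutive slices of at most chunk_size."""
    ordered = sorted(items)
    return [ordered[i:i + chunk_size] for i in range(0, len(ordered), chunk_size)]


# Hypothetical file names standing in for the 4071 scans of sequence 08.
scan_files = [f"{i:06d}.bin" for i in range(4071)]

chunks = split_into_chunks(scan_files, 3000)
print([len(c) for c in chunks])  # [3000, 1071]
```

Each chunk would then be assigned to `self.data_list` for a separate inference run, in the same way as the two manual slices.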

I haven't used the original code, so I can't give advice on it.
Beyond that, the main thing to look at is the error log given by CodaLab.
Maybe you can try to get help here.

What is your environment? I want to try running your code.

Just this repo, running inference with the "all" type.

I did not submit to the test server. I think that if this API verification passes on the validation set, the test set should be fine as well.

Hi @huixiancheng, I ran data_prepare_semantickitti.py successfully, but training the model fails with this error: RuntimeError: weight tensor should be defined either for all 19 classes or no classes but got weight tensor of shape: [1, 19]. What can I do?

Hi, I have not run into this problem. Maybe you should check the number of classes and the class weights.

Here are the weights I calculated and used.

class_weights = torch.tensor([[17.1775, 49.4507, 49.0822, 45.9186, 44.9319, 49.0655, 49.6848, 49.8644,
5.3651, 31.3473, 7.2697, 41.0090, 5.5935, 11.1401, 2.8727, 37.3551,
9.1705, 43.3172, 48.0677]]).cuda()

It really is a tensor of shape torch.Size([1, 19]).

@huixiancheng, thank you very much for your data and advice. I tried it, but it still does not work. Do you think this problem might be related to checkpoint.rar? I can't get it from the link in your readme.md; it was empty.

No, I don't think that has any effect.

@huixiancheng, I am very grateful for your advice. I will try again. Thank you very much.

@xlr-project Maybe you are using torch 1.10? I just reproduced your error with that setup (torch 1.10 with CUDA 11.3). After changing to torch 1.8.1 and CUDA 11.1, it works well.
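
As a possible alternative to downgrading, the weights can be flattened to one dimension before they reach the loss. This is a minimal sketch and rests on an assumption: that the error comes from a stricter shape check on `weight` in newer PyTorch releases, which would expect shape [19] rather than [1, 19]. The placeholder weights and the standalone `nn.CrossEntropyLoss` are illustrative only; how this repo builds its loss is not shown in this thread.

```python
import torch
import torch.nn as nn

num_classes = 19

# Placeholder weights with the [1, 19] shape from the error message
# (illustrative values, not the real class weights).
weights_2d = torch.ones(1, num_classes)

# Flatten to a 1-D tensor of shape [19], one weight per class.
weights_1d = weights_2d.reshape(-1)

criterion = nn.CrossEntropyLoss(weight=weights_1d)

# Dummy batch of 4 predictions, only to check that the loss accepts the weights.
logits = torch.randn(4, num_classes)
labels = torch.randint(0, num_classes, (4,))
loss = criterion(logits, labels)
print(tuple(weights_1d.shape), loss.dim())
```

If the model runs on the GPU, the flattened weights would also need to be moved to the same device as the logits.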

Thank you very much, I have got it working now.