Weird OOM in frnn_gather
yuhao opened this issue
Yuhao Zhu commented
I am using 10M points on a GPU with 11 GB of memory. The search works fine, but in `points2_nn = frnn_gather(points2, idxs, lengths2)`
I get an OOM error:
```
Traceback (most recent call last):
  File "nd.py", line 131, in <module>
    validator.launch()
  File "nd.py", line 120, in launch
    dists_frnn, idxs_frnn, nn_frnn = self.frnn_grid()
  File "nd.py", line 74, in frnn_grid
    return_sorted=True)
  File "/home/vax10/u4/xx/venv/lib64/python3.6/site-packages/frnn-0.0.0-py3.6-linux-x86_64.egg/frnn/frnn.py", line 377, in frnn_grid_points
    points2_nn = frnn_gather(points2, idxs, lengths2)
  File "/home/vax10/u4/xx/venv/lib64/python3.6/site-packages/frnn-0.0.0-py3.6-linux-x86_64.egg/frnn/frnn.py", line 419, in frnn_gather
    tmp_idxs = idxs.clone().detach()
RuntimeError: CUDA out of memory. Tried to allocate 4.64 GiB (GPU 1; 10.76 GiB total capacity; 7.77 GiB already allocated; 1.84 GiB free; 7.80 GiB reserved in total by PyTorch)
```
Lixin Xue commented
Could you give me a minimal example that reproduces this memory error so that I can check which parts are too large? If K or D is large, it could be that a `points2_nn`
of size (N, P, K, D) is simply too large for the GPU.
Yuhao Zhu commented
D is 3, K is 50, and my input point cloud has about 10M points. The OOM
indeed happens inside the `frnn_gather` function, and yes, I think it's
because `points2_nn` is too big.
I figure that for 10M points and K of 50, `points2_nn` will hold 500M neighbors,
each a 3D point, so 500M * 3 * 4 B, which is about 6 GB. Is this
calculation correct? But isn't this gather done completely on the CPU? Not sure
why I got a GPU OOM...
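The back-of-envelope calculation above can be checked with a short script (this is an illustrative estimate only, not part of frnn; the shape (N, P, K, D) and float32 storage are assumptions based on the discussion in this thread):

```python
# Estimate the size of points2_nn, which has shape (N, P, K, D)
# and stores float32 coordinates (4 bytes each).
P = 10_000_000        # ~10M query points (N = 1 batch)
K = 50                # neighbors gathered per point
D = 3                 # coordinates per neighbor
BYTES_PER_FLOAT32 = 4

size_bytes = P * K * D * BYTES_PER_FLOAT32
size_gib = size_bytes / 2**30
print(f"points2_nn needs ~{size_gib:.2f} GiB")  # ~5.59 GiB
```

So the 6 GB figure is roughly right (about 5.6 GiB in binary units), which on its own nearly fills an 11 GB card that is already holding the inputs and the search results.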
Lixin Xue commented
Yeah, I think the calculation is correct. My implementation should work on both GPU and CPU: the gather runs on whatever device the input tensors live on. So if `points2`, `idxs`, and `lengths2` are GPU tensors, the result is allocated on the GPU, which is why you see a GPU OOM.
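One way to sidestep the OOM, given the device-follows-inputs behavior described above, is to gather in chunks and move each chunk off the GPU immediately, so the full (P, K, D) tensor never resides in GPU memory at once. The helper below is a hypothetical sketch, not part of frnn's API; it assumes a single batch (N = 1), `points2` of shape (P2, D), `idxs` of shape (P1, K), and -1 as the padding index:

```python
import torch

def gather_nn_chunked(points2, idxs, chunk_size=1_000_000):
    """Gather neighbor coordinates chunk by chunk, accumulating on the CPU.

    points2: (P2, D) tensor of reference points (CPU or GPU).
    idxs:    (P1, K) tensor of neighbor indices, -1 for padding.
    Returns: (P1, K, D) tensor on the CPU.
    """
    out_chunks = []
    for start in range(0, idxs.shape[0], chunk_size):
        chunk = idxs[start:start + chunk_size]   # (C, K)
        safe = chunk.clamp(min=0)                # map -1 padding to index 0
        nn = points2[safe]                       # (C, K, D) on points2's device
        out_chunks.append(nn.cpu())              # release GPU memory early
    return torch.cat(out_chunks, dim=0)

# Small CPU-only demo of the same logic:
pts = torch.arange(12.).reshape(4, 3)            # 4 reference points, D=3
idx = torch.tensor([[0, 1], [2, -1], [3, 0]])    # P1=3 queries, K=2
nn = gather_nn_chunked(pts, idx, chunk_size=2)
print(nn.shape)  # torch.Size([3, 2, 3])
```

Entries gathered through a padding index of -1 are clamped to index 0 and should be masked out downstream using `lengths`, the same way frnn's padded outputs are normally handled.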