Weird OOM in frnn_gather
yuhao opened this issue
Yuhao Zhu commented
I am using 10M points on a GPU with 11 GB of memory. The search works fine, but in `points2_nn = frnn_gather(points2, idxs, lengths2)`
I get an OOM error:
```
Traceback (most recent call last):
  File "nd.py", line 131, in <module>
    validator.launch()
  File "nd.py", line 120, in launch
    dists_frnn, idxs_frnn, nn_frnn = self.frnn_grid()
  File "nd.py", line 74, in frnn_grid
    return_sorted=True)
  File "/home/vax10/u4/xx/venv/lib64/python3.6/site-packages/frnn-0.0.0-py3.6-linux-x86_64.egg/frnn/frnn.py", line 377, in frnn_grid_points
    points2_nn = frnn_gather(points2, idxs, lengths2)
  File "/home/vax10/u4/xx/venv/lib64/python3.6/site-packages/frnn-0.0.0-py3.6-linux-x86_64.egg/frnn/frnn.py", line 419, in frnn_gather
    tmp_idxs = idxs.clone().detach()
RuntimeError: CUDA out of memory. Tried to allocate 4.64 GiB (GPU 1; 10.76 GiB total capacity; 7.77 GiB already allocated; 1.84 GiB free; 7.80 GiB reserved in total by PyTorch)
```
Lixin Xue commented
Could you give me a minimal example that reproduces this memory error so that I can check which parts are too large? If K or D is large, it could be that a `points2_nn`
of size (N, P, K, D) is simply too large for the GPU.
Yuhao Zhu commented
D is 3, K is 50, and my input point cloud has about 10M points. The OOM
indeed happens inside the `frnn_gather` function, and yes, I think it's
because `points2_nn` is too big.
I figure that for 10M points and K of 50, `points2_nn` will hold 500M neighbors,
each a 3D point, so 500M * 3 * 4 B, which is about 6 GB. Is this
calculation correct? But isn't this gather done completely on the CPU? Not sure
why I got a GPU OOM...
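The back-of-envelope calculation above can be checked with a short script (this is an illustrative estimate only, not part of frnn; the shape (N, P, K, D) and float32 storage are assumptions based on the discussion in this thread):

```python
# Estimate the size of points2_nn, which has shape (N, P, K, D)
# and stores float32 coordinates (4 bytes each).
P = 10_000_000        # ~10M query points (N = 1 batch)
K = 50                # neighbors gathered per point
D = 3                 # coordinates per neighbor
BYTES_PER_FLOAT32 = 4

size_bytes = P * K * D * BYTES_PER_FLOAT32
size_gib = size_bytes / 2**30
print(f"points2_nn needs ~{size_gib:.2f} GiB")  # ~5.59 GiB
```

So the 6 GB figure is roughly right (about 5.6 GiB in binary units), which on its own nearly fills an 11 GB card that is already holding the inputs and the search results.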
Lixin Xue commented
Yeah, I think the calculation is correct. My implementation should work on both GPU and CPU: the gather runs on whatever device the input tensors live on. So if `points2`, `idxs`, and `lengths2` are GPU tensors, the result is allocated on the GPU, which is why you see a GPU OOM.
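One way to sidestep the OOM, given the device-follows-inputs behavior described above, is to gather in chunks and move each chunk off the GPU immediately, so the full (P, K, D) tensor never resides in GPU memory at once. The helper below is a hypothetical sketch, not part of frnn's API; it assumes a single batch (N = 1), `points2` of shape (P2, D), `idxs` of shape (P1, K), and -1 as the padding index:

```python
import torch

def gather_nn_chunked(points2, idxs, chunk_size=1_000_000):
    """Gather neighbor coordinates chunk by chunk, accumulating on the CPU.

    points2: (P2, D) tensor of reference points (CPU or GPU).
    idxs:    (P1, K) tensor of neighbor indices, -1 for padding.
    Returns: (P1, K, D) tensor on the CPU.
    """
    out_chunks = []
    for start in range(0, idxs.shape[0], chunk_size):
        chunk = idxs[start:start + chunk_size]   # (C, K)
        safe = chunk.clamp(min=0)                # map -1 padding to index 0
        nn = points2[safe]                       # (C, K, D) on points2's device
        out_chunks.append(nn.cpu())              # release GPU memory early
    return torch.cat(out_chunks, dim=0)

# Small CPU-only demo of the same logic:
pts = torch.arange(12.).reshape(4, 3)            # 4 reference points, D=3
idx = torch.tensor([[0, 1], [2, -1], [3, 0]])    # P1=3 queries, K=2
nn = gather_nn_chunked(pts, idx, chunk_size=2)
print(nn.shape)  # torch.Size([3, 2, 3])
```

Entries gathered through a padding index of -1 are clamped to index 0 and should be masked out downstream using `lengths`, the same way frnn's padded outputs are normally handled.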