NVIDIA / warp

A Python framework for high performance GPU simulation and graphics

Home Page: https://nvidia.github.io/warp/

GPU memory leaked when destructing warp.Mesh

MaxWipfli opened this issue

We noticed that GPU memory usage increases when repeatedly creating (and destroying) warp.Mesh objects.

Minimal Example:

import warp as wp
import pynvml       # pip install pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
wp.init()

device = "cuda:0"
points = wp.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]], dtype=wp.vec3, device=device)
indices = wp.array([0, 1, 2, 0, 1, 2, 0, 1, 2], dtype=wp.int32, device=device)

for i in range(10_000_000):
    if i % 100_000 == 0:
        gpu_ram_usage = pynvml.nvmlDeviceGetMemoryInfo(handle).used / 1024 ** 2
        print(f"iter = {i // 1000:5d}k, VRAM usage = {gpu_ram_usage:.0f} MiB")
    mesh = wp.Mesh(points, indices)

Output:

   CUDA Toolkit 12.3, Driver 12.3
   Devices:
     "cpu"      : "x86_64"
     "cuda:0"   : "NVIDIA GeForce RTX 2080 SUPER" (8 GiB, sm_75, mempool enabled)
[...]
iter =     0k, VRAM usage = 521 MiB
iter =   100k, VRAM usage = 565 MiB
iter =   200k, VRAM usage = 629 MiB
[...]
iter =  1900k, VRAM usage = 1429 MiB

GPU memory usage grows steadily (from 521 MiB to 1429 MiB over 1.9 million iterations), even though each Mesh is destroyed as soon as it is replaced by the next one.
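To put a rough number on the growth, here is a small sketch that computes the average increase per loop iteration from the first and last samples in the log above. The helper name `leak_per_iteration` is made up for this illustration. NVML reports device-wide usage, so this is only an estimate, assuming nothing else on the GPU changed its memory footprint during the run.

```python
# Figures copied from the first and last samples in the log above.
MIB = 1024 ** 2

first_iter, first_mib = 0, 521
last_iter, last_mib = 1_900_000, 1429


def leak_per_iteration(iter_a, mib_a, iter_b, mib_b):
    """Average growth in bytes per loop iteration between two samples."""
    return (mib_b - mib_a) * MIB / (iter_b - iter_a)


rate = leak_per_iteration(first_iter, first_mib, last_iter, last_mib)
print(f"approx. {rate:.0f} bytes per wp.Mesh")  # prints "approx. 501 bytes per wp.Mesh"
```

A few hundred bytes per mesh would be consistent with one small fixed-size allocation being lost on every create/destroy cycle, rather than the vertex or index data.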

This has been tested on the latest main commit (ebcc90d). There is no host memory leak when using device = "cpu", as far as we can tell.

After an initial investigation, the problem seems to be the following:

  • When creating a mesh (in mesh_create_device), a BVH is created as follows (warp/warp/native/mesh.cu, lines 211 to 212 in ebcc90d):

    uint64_t bvh_id = bvh_create_device(mesh.context, mesh.lowers, mesh.uppers, num_tris);
    wp::bvh_get_descriptor(bvh_id, mesh.bvh);

  • When destroying the mesh again (in mesh_destroy_device), the BVH is destroyed as follows:

    wp::bvh_destroy_device(mesh.bvh);

During creation, the following memory block is allocated on the device (in bvh_create_device):

wp::BVH* bvh_device = (wp::BVH*)alloc_device(WP_CURRENT_CONTEXT, sizeof(wp::BVH));

This allocation does not have a corresponding free_device() call and is thus leaked.
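The pattern can be illustrated with a toy model in plain Python. This is not Warp code and not the actual fix: `ToyDevice`, `toy_bvh_create`, and the two destroy functions are hypothetical stand-ins, and the block sizes are arbitrary. It only shows how a create path that makes two allocations leaks one block per cycle when the destroy path releases only one of them.

```python
class ToyDevice:
    """Hypothetical stand-in for a device allocator that records live blocks."""

    def __init__(self):
        self._next_addr = 1
        self.live = {}  # address -> (size, tag)

    def alloc(self, size, tag):
        addr = self._next_addr
        self._next_addr += 1
        self.live[addr] = (size, tag)
        return addr

    def free(self, addr):
        del self.live[addr]


def toy_bvh_create(dev):
    # Two allocations: the tree data and a small device-side struct.
    nodes = dev.alloc(1024, "bvh nodes")        # size is arbitrary
    desc = dev.alloc(64, "device-side struct")  # size is arbitrary
    return {"nodes": nodes, "desc": desc}


def toy_bvh_destroy_leaky(dev, bvh):
    # Frees the tree data but forgets the device-side struct.
    dev.free(bvh["nodes"])


def toy_bvh_destroy_fixed(dev, bvh):
    # Every alloc in the create path has a matching free.
    dev.free(bvh["nodes"])
    dev.free(bvh["desc"])


def run(destroy, n=1000):
    """Create and destroy n toy BVHs; return the number of blocks still live."""
    dev = ToyDevice()
    for _ in range(n):
        destroy(dev, toy_bvh_create(dev))
    return len(dev.live)


print(run(toy_bvh_destroy_leaky))  # 1000
print(run(toy_bvh_destroy_fixed))  # 0
```

Counting live blocks after a balanced create/destroy loop is a quick way to check that every allocation in a create path has a matching release.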

I am not familiar enough with this code base to propose a clean fix. However, here is a "hacky" patch that resolves the problem: https://gist.github.com/MaxWipfli/3197354809752d377dd90bbd108e1992

Thanks @MaxWipfli, nice catch! Your fix is on the right track. I'll take a closer look and we'll get this leak patched up asap.

The fix is now in main.