tracel-ai / burn

Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.

Home Page: https://burn.dev

Memory leak in Wgpu backend

joshhansen opened this issue

Describe the bug
Memory leaks when using the Wgpu backend, but not when using the NdArray backend.

To Reproduce
A minimal reproduction is available at https://github.com/joshhansen/burnleak.

Check out the repository and run cargo run --release on the master branch. Watch the memory usage: in my case, it climbs steadily and rapidly.

Then check out the master_ndarray branch and run it again. In my case, the memory usage does not climb.

Running on the Wgpu backend but with the WgpuDevice::Cpu device slows the leak without eliminating it.
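For context, the reproduction amounts to a tight allocate-compute-drop loop. The sketch below is only an assumed shape of that loop (the actual code is in the linked burnleak repository), written against the burn 0.13-style API where the device is passed to Tensor::random; it requires the wgpu feature of the burn crate.

```rust
// Minimal sketch of an assumed reproduction loop, not the actual burnleak code.
// With the NdArray backend the process RSS stays flat; with Wgpu it keeps growing.
use burn::backend::wgpu::{Wgpu, WgpuDevice};
use burn::tensor::{Distribution, Tensor};

fn main() {
    let device = WgpuDevice::default();
    loop {
        // Allocate fresh tensors, run a small computation, then drop everything.
        let a = Tensor::<Wgpu, 2>::random([512, 512], Distribution::Default, &device);
        let b = Tensor::<Wgpu, 2>::random([512, 512], Distribution::Default, &device);
        // Reading the data back forces the computation to complete each iteration.
        let _ = a.matmul(b).into_data();
    }
}
```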

Expected behavior
The program should run without a memory leak on the Wgpu backend like it does on the NdArray backend.

Screenshots
n/a

Desktop:

  • OS: Linux Mint 21.3, kernel 6.5.0 x86_64
  • Burn: 0.13.2
  • Rust: 1.76.0
  • Nvidia driver 545.29.06-0ubuntu0.22.04.2

Additional context
Heaptrack memory leak profile: [two heaptrack screenshots attached to the original issue]

I started experiencing the same leak when I upgraded from burn 0.12 to burn 0.13.

I tried your repro repo with burn 0.12 as a dependency and there is no aggressive leak, so it seems like a regression.


This is what I saw with my personal hobby project's monitoring: [memory usage graph attached]

Thanks for the quick repro @kurtlawrence. I can confirm that downgrading to 0.12.1 results in much better memory usage: there are still leaks coming from wgpu, but an order of magnitude less (82.6 MB of leaks across 1.4 GB of heaptrack samples vs. 2.1 GB of leaks across 626 MB of samples).

I've been working on the memory management strategy currently implemented in the wgpu runtime on the master branch. The current approach results in higher average memory usage, which is intentional. The new strategy is designed to be lazier in freeing unused memory and more aggressive in reusing it. For dynamic memory workloads, this can lead to performance improvements of up to 60%.
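To make the trade-off concrete, here is a conceptual sketch of the lazy-free / aggressive-reuse idea (this is not burn's actual allocator, just an illustration of the strategy described above): released buffers are parked in size-keyed free lists instead of being returned to the system, so later allocations of the same size can bypass allocation entirely.

```rust
// Conceptual illustration of a lazy-free, reuse-first pool; not burn's implementation.
use std::collections::HashMap;

struct BufferPool {
    // Buffers are kept around after release, keyed by their size in bytes.
    free: HashMap<usize, Vec<Vec<u8>>>,
}

impl BufferPool {
    fn new() -> Self {
        Self { free: HashMap::new() }
    }

    /// Reuse an existing buffer of the requested size if one is available,
    /// otherwise allocate a new one.
    fn alloc(&mut self, size: usize) -> Vec<u8> {
        self.free
            .get_mut(&size)
            .and_then(|buffers| buffers.pop())
            .unwrap_or_else(|| vec![0u8; size])
    }

    /// Lazy free: instead of deallocating, keep the buffer for later reuse.
    /// This is what drives average memory usage up.
    fn release(&mut self, buffer: Vec<u8>) {
        self.free.entry(buffer.len()).or_default().push(buffer);
    }
}
```

The performance win comes from skipping allocations for recurring tensor shapes; the cost is that memory held in the pool does not shrink unless something explicitly frees it.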

Our assumption is that for most deep learning use cases, the GPU device will be primarily dedicated to the training or inference tasks being performed. Therefore, we've prioritized better performance at the cost of higher average memory usage. While we don't believe this strategy leads to significantly higher peak memory usage or more frequent out-of-memory situations, we recognize that this could be a potential issue.

If average memory usage is a concern, we could consider adding an option for users to configure the memory management behavior in the wgpu runtime.
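Purely as an illustration of what such an option could look like (a hypothetical sketch, not an existing burn API; the names are made up):

```rust
// Hypothetical configuration knob for the wgpu runtime; not an existing burn API.
pub enum MemoryManagementStrategy {
    /// Free unused GPU buffers eagerly: lower average memory, more allocations.
    EagerFree,
    /// Keep unused buffers for reuse: higher average memory, better throughput on
    /// dynamic workloads. An optional cap could bound how much the pool retains.
    LazyReuse { max_pool_bytes: Option<u64> },
}
```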

@kurtlawrence Is it CPU RAM leakage or GPU memory?

CPU RAM is leaking.

@mepatrick73 The issue is not the level of memory usage; it's that memory usage grows without bound. Eventually it would consume all system memory if I didn't terminate the program. I have 128 GB of system RAM and 4 GPUs with 48 GB of VRAM. The task in my case is inference on a small model; each instance consumes 28 MiB of VRAM according to nvidia-smi.