apitrace / apitrace

trace: https://gitlab.freedesktop.org/gfx-ci/tracie/traces-db/-/merge_requests/54/diffs?commit_id=7ca7929955fbe4e95a27515ee8b19e3007afeda7

eglretrace --headless --benchmark --singlethread --loop=150 --pframes opengl:GPU\ Duration blender/blender-base.trace

After ~100 runs of this small trace, memory on a 16GB laptop is exhausted. My guess is that something isn't being freed.

Tested on master branch and 11.1.

qapitrace behaves the same. When clicking on frame 1 manually, it starts exhausting memory.

EDIT: with qapitrace I'm seeing the memory drop for other traces too.

Regarding --loop, this is indeed a small trace, in fact too small:

$ apitrace dump --grep='glXSwapBuffers|glCompileShader' blender-base.trace 
1995 glXSwapBuffers(dpy = 0xe624a200, drawable = 83886130)

3306 glCompileShader(shader = 2)
3311 glCompileShader(shader = 3)
3386 glCompileShader(shader = 5)
3391 glCompileShader(shader = 6)
3437 glCompileShader(shader = 8)
3442 glCompileShader(shader = 9)
[....]
10149 glCompileShader(shader = 204)
10646 glCompileShader(shader = 206)
10651 glCompileShader(shader = 207)
10787 glCompileShader(shader = 209)
10792 glCompileShader(shader = 210)
10855 glXSwapBuffers(dpy = 0xe624a200, drawable = 83886082)

The last frame is the one where all resources are created. There should be no surprise that looping over the last frame causes leaks.
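To see which frame creates the resources, one can count calls per frame in the dump output. Below is a minimal Python sketch, assuming the `apitrace dump` line format shown above (call number, function name, arguments) and `glXSwapBuffers` as the frame boundary. The helper name `calls_per_frame` and the shortened sample are illustrative only, not part of apitrace.

```python
import re
from collections import Counter

# One line of `apitrace dump` output as shown above: "<call number> <function>(...)".
CALL_RE = re.compile(r"^(\d+)\s+(\w+)\(")


def calls_per_frame(dump_lines, frame_end="glXSwapBuffers"):
    """Return one Counter of function names per frame, split on frame_end calls."""
    frames = []
    current = Counter()
    for line in dump_lines:
        match = CALL_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and elisions such as "[....]"
        name = match.group(2)
        current[name] += 1
        if name == frame_end:
            frames.append(current)
            current = Counter()
    if current:
        frames.append(current)  # calls after the last swap, if any
    return frames


# Shortened sample taken from the dump above.
sample = """\
1995 glXSwapBuffers(dpy = 0xe624a200, drawable = 83886130)
3306 glCompileShader(shader = 2)
3311 glCompileShader(shader = 3)
10855 glXSwapBuffers(dpy = 0xe624a200, drawable = 83886082)
""".splitlines()

for index, frame in enumerate(calls_per_frame(sample)):
    print(index, frame["glCompileShader"])
```

In this sample all glCompileShader calls land in the second (last) frame, which is the frame --loop repeats.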

I keep emphasising that options such as --loop are not magical. They are as dumb as it gets, in fact: apitrace will blindly replay the calls in question, with no regard for leaks, dead resources, or whatnot.
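As a toy illustration of why blind replay leaks, here is a Python sketch. It is not apitrace's actual replay code, and the `replay_frame` helper and the contents of `last_frame` are made up for the example. The frame creates objects and never deletes them, so every loop iteration adds to the live set.

```python
def replay_frame(live_objects, frame_calls):
    """Replay calls verbatim; nothing is freed unless the frame itself frees it."""
    for call, handle in frame_calls:
        if call == "create":
            live_objects.append(handle)
        elif call == "delete":
            live_objects.remove(handle)


# Hypothetical last frame: it creates resources and never deletes them.
last_frame = [("create", "shader"), ("create", "texture"), ("create", "buffer")]

live = []
for _ in range(150):  # same count as --loop=150 in the report
    replay_frame(live, last_frame)

print(len(live))  # 150 loops * 3 creations = 450 live objects
```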

The only thing actionable here for me is to have apitrace emit a loud warning when --loop is used.

Regarding qapitrace, I don't understand the problem. I opened the trace with qapitrace and saw no evidence of a problem. qapitrace uses a lot of memory as it keeps most calls in memory. Still, this trace is quite small and should pose no significant issue as far as memory consumption is concerned.

The only thing actionable here for me is to have apitrace emit a loud warning when --loop is used.

In fact there's a warning there already, at apitrace/retrace/retrace_main.cpp, line 1405 in 5643863:

           std::cerr << "warning: --loop blindly repeats the last frame calls, therefore frames might not necessarily render correctly (https://github.com/apitrace/apitrace/issues/800)" << std::endl;

I'd like to avoid folks having unrealistic expectations from using --loop but it's not obvious how I can make the warning better. @okias, if you have suggestions on how to improve this warning let me know.

Btw. for me, on some frames replayed by click in apitrace (probably same codepath as replayer banging frame again and again), I'm seeing a 300M drop in memory. 150 * 0.3G = ~45G. I know it's not magic, but it's quite limiting for replaying larger traces. I tried valgrind and it also fired some warnings from the iris driver, but it's the first time I've used valgrind, so I'll have to examine it further (here is the dump, which I'm trying to interpret):

valgrind_memory_libgdx_trace.txt

probably same codepath as replayer banging frame again and again

They should be different code paths: qapitrace doesn't replay calls itself; instead it calls glretrace for replaying, so even if glretrace leaks, that memory is freed when glretrace exits. What causes qapitrace to use more memory are the state dumps and profile stats: the more one does, the more memory it will take. It's possible there are some leaks somewhere, but for large traces, large memory usage is essentially unavoidable.

valgrind_memory_libgdx_trace.txt

It is very difficult to analyze this without more context. The Intel driver can have some minor leaks. Apitrace might have some minor leaks. It's also possible that the trace is incomplete or the application didn't tear down all GL objects cleanly, therefore causing apitrace and the Intel driver to leak.

Because of this, it's difficult to act on valgrind output. In short, trying to get clean valgrind output seems hopeless to me, but if qapitrace/glretrace do run out of memory (and that's not due to --loop) then that's something I'd be keener to investigate and address.
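For gathering that kind of evidence, peak memory of a single replay without --loop can be measured from a small wrapper. The following is a rough Python sketch, assuming a Unix-like system where the standard `resource` module is available. The helper name `peak_rss_of` is hypothetical, and the command in the comment reuses the binary, flag and trace path from the original report.

```python
import resource
import subprocess


def peak_rss_of(cmd):
    """Run cmd to completion and return (exit code, peak RSS of child processes).

    ru_maxrss is reported in kilobytes on Linux (bytes on macOS).
    RUSAGE_CHILDREN covers every child this process has waited for so far,
    so measure one replay per script run to keep the number meaningful.
    """
    result = subprocess.run(cmd, check=False)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return result.returncode, usage.ru_maxrss


# Example call (not executed here):
# peak_rss_of(["eglretrace", "--headless", "blender/blender-base.trace"])
```

Comparing the peak for a plain replay against a --loop run would separate ordinary memory usage from growth caused by repeating the last frame.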

blender-base.trace --loop 150 exhaust memory of regular 16G laptop