RenderDoc Capture with Mesh shader payload causes GPU resets and system freezes
Firestar99 opened this issue · comments
Opening a RenderDoc capture that has task and mesh shaders which use a payload to send data between them causes GPU resets and system freezes. I have found two very different ways of triggering it, both of them somehow related to amdvlk. These repro instructions assume a clean Ubuntu 24.04 system to start, with only RADV and no AMDVLK installed.
Standard AMDVLK Capture
- Install AMDVLK deb package on your system
- Open the AMDVLK capture
- Rarely, RenderDoc will freeze here already
- Select the single `vkCmdDrawMeshTasksEXT` call
- Observe the RenderDoc window freezing, and a 5/5 system freeze up to ~1min later
Opening a RADV Capture while AMDVLK is just present but unused
- Open the RADV capture and observe RenderDoc working as expected
- Install AMDVLK deb package on your system
- Delete `/etc/vulkan/implicit_layer.d/amd_icd64.json` to remove the `VK_LAYER_AMD_switchable_graphics_64` implicit layer, which forces you to always use the amdvlk driver
- Verify that `vulkanCapsViewer` can see both drivers, RADV with `AMD Radeon Graphics (RADV REMBRANDT)` and amdvlk with `AMD Radeon Graphics` (I wish amdvlk had a more identifiable name)
- Open the same RADV capture again, but this time observe RenderDoc freezing, likely followed by a 5/5 GPU reset or rarely 1/5 system freeze
- With RADV you don't even need to select the draw itself, loading the capture is almost always enough.
=> My current conclusion is that opening a RADV capture while an amdvlk device is available, even though it is unused, is enough to cause RenderDoc to freeze and a GPU reset to follow.
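For anyone reproducing this, the driver setup described in the steps above can also be checked without `vulkanCapsViewer`. The following is a minimal sketch, not part of the original report: it reads the Vulkan loader's JSON manifests and lists the installed ICDs and implicit layers. The directory lists are an assumption (a common subset of the loader's Linux search paths), and the helper names `list_icds` and `list_implicit_layers` are hypothetical.

```python
import json
from pathlib import Path

# Assumption: a common subset of the Vulkan loader's Linux search paths.
ICD_DIRS = ["/etc/vulkan/icd.d", "/usr/share/vulkan/icd.d"]
IMPLICIT_LAYER_DIRS = [
    "/etc/vulkan/implicit_layer.d",
    "/usr/share/vulkan/implicit_layer.d",
]


def _load_manifests(dirs):
    """Yield (path, parsed JSON) for every readable *.json manifest in dirs."""
    for d in dirs:
        directory = Path(d)
        if not directory.is_dir():
            continue
        for path in sorted(directory.glob("*.json")):
            try:
                yield path, json.loads(path.read_text())
            except (OSError, ValueError):
                continue  # skip unreadable or malformed manifests


def list_icds(dirs=ICD_DIRS):
    """Return {manifest path: driver library path} for each ICD manifest."""
    icds = {}
    for path, data in _load_manifests(dirs):
        icd = data.get("ICD") if isinstance(data, dict) else None
        if isinstance(icd, dict) and "library_path" in icd:
            icds[str(path)] = icd["library_path"]
    return icds


def list_implicit_layers(dirs=IMPLICIT_LAYER_DIRS):
    """Return {layer name: manifest path} for each implicit layer manifest."""
    layers = {}
    for path, data in _load_manifests(dirs):
        if not isinstance(data, dict):
            continue
        # A manifest may hold a single "layer" object or a "layers" array.
        entries = list(data.get("layers", []))
        if "layer" in data:
            entries.append(data["layer"])
        for entry in entries:
            if isinstance(entry, dict) and "name" in entry:
                layers[entry["name"]] = str(path)
    return layers


if __name__ == "__main__":
    for manifest, library in list_icds().items():
        print(f"ICD   {manifest} -> {library}")
    for name, manifest in list_implicit_layers().items():
        print(f"layer {name} ({manifest})")
```

If the system matches the second scenario, one would expect both a RADV and an AMDVLK ICD manifest to be listed, and `VK_LAYER_AMD_switchable_graphics_64` to be absent from the implicit layers.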
RenderDoc log, in case you want to confirm that RenderDoc indeed uses RADV as the replay device and that AMDVLK is merely present.
Related issues
baldurk/renderdoc#3309
https://gitlab.freedesktop.org/mesa/mesa/-/issues/11156