[amdgpu] quitting a compositor often locks the driver up
valpackett opened this issue · comments
Quitting the window server on amdgpu often locks the driver up and leaves the window server completely unkillable. This has been an issue since forever, and it's still an issue on 5.2.
Currently the hanging stack trace when quitting wayfire looks like:
exit1
→ fdescfree
→ … → linux_file_close
→ drm_release
→ drm_file_free
→ amdgpu_driver_postclose_kms
→ amdgpu_vm_fini
→ drm_sched_entity_destroy
→ drm_sched_entity_flush
→ linux_wait_event_common
→ linux_add_to_sleepqueue
→ sleepq_wait_sig
→ sleepq_catch_signals
→ mi_switch
upd: huh, drm_sched_entity_flush is the source of the infamous "==========> BUG: entity->rq->sched is NULL" print
upd: so this is supposed to time out in 1000ms but doesn't
oh, that was easy: https://reviews.freebsd.org/D25509
Landed as rS362829