JVM hangs at safepoint synchronization when profiling some large applications
AlexVanGogen opened this issue · comments
Sometimes I get the following thread states:
- the profiler thread blocks at a safepoint while executing `jvmti->GetLineNumberTable` and holding `frame_lock`;
- one user thread in the signal handler holds `in_scope_lock` and spins on acquiring `frame_lock`;
- some other user threads in the signal handler spin on acquiring `in_scope_lock`.
According to gdb, the user threads were interrupted while already blocked at a safepoint, and these threads are unable to block once more. Running the application with `-XX:+SafepointTimeout` and other related flags agrees with gdb: it reports that the same user threads that spin in the signal handler cannot reach the safepoint.
Tightening the critical section in the profiler code, so that it is entered only to work with `static_call_frames`, fixes this problem, but I guess not entirely: it only sharply reduces the chance of its occurrence.
It's odd that we're stuck at `jvmti->GetLineNumberTable`, but I'm pretty sure (as you said on gitter) we don't need to be holding `frame_lock` past the `for` loop at https://github.com/Decave/JCoz/blob/master/src/native/profiler.cc#L385. `frame_lock` really just needs to protect `static_call_frames`, which is used to collect call frames in the user threads when an experiment isn't running.
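The narrowing discussed above could look roughly like the sketch below. This is not JCoz's actual code: `CallFrame`, `SpinLock`, `RecordFrame`, `DrainAndResolve`, `ResolveLine`, and `kMaxFrames` are hypothetical names, and `ResolveLine` is only a stand-in for the `jvmti->GetLineNumberTable` call. The idea is that the profiler thread holds `frame_lock` only long enough to copy `static_call_frames` into a local buffer, and performs the lookup that may block at a safepoint after releasing the lock.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical frame record; the real agent would store a jmethodID and jlocation.
struct CallFrame {
    long method_id;
    long location;
};

// Minimal spin lock standing in for frame_lock (assumed, not JCoz's implementation).
class SpinLock {
public:
    void lock() { while (locked_.exchange(true, std::memory_order_acquire)) { /* spin */ } }
    void unlock() { locked_.store(false, std::memory_order_release); }
    bool try_lock() { return !locked_.exchange(true, std::memory_order_acquire); }
private:
    std::atomic<bool> locked_{false};
};

constexpr std::size_t kMaxFrames = 256;
SpinLock frame_lock;
CallFrame static_call_frames[kMaxFrames];
std::size_t static_frame_count = 0;

// Set if the lookup ever runs while frame_lock is held.
bool lock_held_during_lookup = false;

// Stand-in for jvmti->GetLineNumberTable: in the real agent this call may block
// at a safepoint, so it must not run while frame_lock is held.
int ResolveLine(const CallFrame& frame) {
    if (frame_lock.try_lock()) {
        frame_lock.unlock();             // lock was free, as intended
    } else {
        lock_held_during_lookup = true;  // someone still holds frame_lock
    }
    return static_cast<int>(frame.location / 10);  // fake bytecode-index-to-line mapping
}

// User-thread side (signal handler in the real agent): store one frame under the lock.
void RecordFrame(CallFrame frame) {
    frame_lock.lock();
    if (static_frame_count < kMaxFrames) {
        static_call_frames[static_frame_count++] = frame;
    }
    frame_lock.unlock();
}

// Profiler-thread side: copy under the lock, resolve line numbers outside it.
std::vector<int> DrainAndResolve() {
    std::vector<CallFrame> local;
    local.reserve(kMaxFrames);  // allocate before entering the critical section

    frame_lock.lock();
    local.assign(static_call_frames, static_call_frames + static_frame_count);
    static_frame_count = 0;
    frame_lock.unlock();        // critical section ends here

    std::vector<int> lines;
    lines.reserve(local.size());
    for (const CallFrame& frame : local) {
        lines.push_back(ResolveLine(frame));  // potentially blocking work, no lock held
    }
    return lines;
}
```

With this shape, a user thread spinning on `frame_lock` in its signal handler waits only for a short copy, not for a JVMTI call that may itself be waiting for that thread to reach a safepoint. As noted in the issue, this may only shrink the window rather than remove the hang entirely.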