Byte-Lab / JCoz

JCoz -- A Java causal profiler

Home Page: http://decave.github.io/JCoz/

JCoz sometimes reports strange delays in experiment results

AlexVanGogen opened this issue

The total delay reported in an experiment result is sometimes hard to explain. For example, baseline experiments can show non-zero delays, and once the reported delay was even greater than the duration of the experiment itself.

It looks like in these cases a thread never handles any of the signals that reset thread-local delays. Such a thread may take a very long time to handle the last signal it received during an experiment, long enough for the next experiment to be set up. The next experiment then runs with stale thread-local delays, which skews the global delay. Any thread whose local delay has already been reset to zero will then sleep in the signal handler although it isn't supposed to.

I think the solution here is to use a thread barrier between signaled user threads and the agent thread running an experiment.

One simple thing we can do here is add an atomic counter that the agent thread initializes to 0 before signaling, that each user thread atomically increments before returning from its signal handler, and that the agent thread waits on before returning from signal_user_threads.
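
A rough sketch of that counter idea, to make the handshake concrete. This is not JCoz's code: `handlers_done`, `local_delay`, and `handle_signal` are hypothetical names, the signature of `signal_user_threads` is made up, and plain threads stand in for real signal delivery so the example runs on its own. It also assumes the agent waits until the counter reaches the number of signaled threads.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-ins; none of these declarations are taken from JCoz.
static std::atomic<int> handlers_done{0};   // the proposed atomic counter
static thread_local long local_delay = 0;   // per-thread delay state

// Stand-in for the signal handler body that runs on each user thread.
void handle_signal(long stale_delay) {
    local_delay = stale_delay;  // simulate a delay left over from the last experiment
    local_delay = 0;            // the reset that must finish before the next experiment
    // Announce completion only after the reset is done.
    handlers_done.fetch_add(1, std::memory_order_release);
}

// Stand-in for the agent side: zero the counter, "signal" every user thread,
// then wait until all of them have reported back. Returns the final count.
int signal_user_threads(int n_threads) {
    handlers_done.store(0, std::memory_order_relaxed);

    std::vector<std::thread> users;
    for (int i = 0; i < n_threads; ++i) {
        // Simulated signal delivery: each thread just runs the handler body.
        users.emplace_back(handle_signal, 100L * (i + 1));
    }

    // The barrier: do not return while any handler is still running.
    while (handlers_done.load(std::memory_order_acquire) < n_threads) {
        std::this_thread::yield();
    }

    for (auto& t : users) {
        t.join();
    }
    return handlers_done.load(std::memory_order_relaxed);
}
```

With this shape, the agent cannot start preparing the next experiment until every signaled thread has finished resetting its local delay. In a real signal handler the increment would need to be lock-free to be async-signal-safe, which is worth checking for the target platform (for example with `std::atomic<int>::is_always_lock_free` in C++17).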