May be a icache miss thread still can be scheduled again?
zhangliang1997 opened this issue · comments
Hi jbush001:
I think i find a bug, but i'm not sure. In ifetch_tag_stage.v, there are some codes about thread scheduled.
assign can_fetch_thread_bitmap = ts_fetch_en & ~icache_wait_threads;
assign thread_sleep_mask_oh = cache_miss_thread_oh & {`THREADS_PER_CORE{ifd_cache_miss}};
assign icache_wait_threads_nxt = (icache_wait_threads | thread_sleep_mask_oh)
& ~l2i_icache_wake_bitmap;
I think a icache miss thread still can be scheduled, for example:
In cycle n, if we select 0 thread to fetch instruction;
In cycle n+1, 0 thread icache miss in ifetch_data_stage, in the same time(cycle n+1), we will select a thread to fetch instruction. Because 0 thread icache miss, so in n+1 cycle, 0 thread can't have opportunity to be scheduled or arbitrated.
So i think can_fetch_thread_bitmap should be:
assign can_fetch_thread_bitmap = ts_fetch_en & ~icache_wait_threads & ~thread_sleep_mask_oh;
So is there any problem about these code ? or just i don't have a thorough understand about code?
Good catch! Your observation is correct that it may schedule the thread unnecessarily if there is a miss, and that adding ~thread_sleep_mask_oh would prevent that. It actually used to be implemented the way you suggested. The reason I changed it is that it ended up being a long combinational path that significantly reduced the maximum clock speed. My workaround was to invalidate the instruction (which injects a bubble, but is otherwise correct). This wastes a clock cycle, but should be relatively infrequent, and my premise is that the improvement in clock speed would outweigh that.
Here is the change where I removed that logic:
Later I removed even more:
I was not able to think of a way to reduce the combinational delay in that logic and avoid this.
Thanks very much ! I have understanded it.
By the way, i learned a lot from nyuzi, such as the verilog implement of the gpgpu. Before i find nyuzi, i just learned gpgpu architecture from gpgpusim. After l read nyuzi code, l learn a lot "real" things. This is a very cool project!
Thanks again!
Thanks, I'm glad you found it useful! :)