OpenXiangShan / XiangShan

Open-source high-performance RISC-V processor

Ensuring RVWMO Compliance for XS: Missing aq/rl Annotation Implementations?

GTwhy opened this issue · comments

Before start

PLEASE MAKE SURE you have done these:

  • (Select what you have done like this)
  • I have read the RISC-V ISA Manual and this is not a RISC-V ISA question.
  • I have read the XiangShan Documents.
  • I have searched the previous issues and did not find anything relevant.
  • I have searched the previous discussions and did not find anything relevant.
  • I have reviewed the commit messages from the relevant commit history.

Describe the question

Hi, I am working on some RVWMO verification tasks based on XiangShan (Nanhu, 2d7581b). I noticed that XiangShan seems not to have implemented the aq/rl annotation for memory operations. How does XiangShan ensure compliance with the relevant PPO requirements? I found that there was related work in commit 29f8af8, but it appears to have been overlooked. If there are other implementations that I have missed, please point them out. Thank you!

To simplify the hardware implementation, we deliberately ignore the aq/rl bits. This does not mean that we violate RVWMO; rather, the memory consistency model we implement is stricter than RVWMO (at the expense of some synchronization cost, of course).
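As a rough illustration of why the stricter treatment cannot violate RVWMO, the sketch below compares the orderings that the acquire/release rules demand for a small program with the orderings obtained when every atomic is treated as if both aq and rl were set. It is a toy model under stated assumptions, not XiangShan code: the helper names (`op`, `ordered_pairs`, `strictest`) and the example program are hypothetical, and only the acquire and release rules of the preserved program order are modelled.

```python
from itertools import combinations

def op(name, atomic=False, aq=False, rl=False):
    """Describe one memory operation (hypothetical representation)."""
    return {"name": name, "atomic": atomic, "aq": aq, "rl": rl}

def ordered_pairs(ops):
    """Pairs (i, j), i before j in program order, forced by acquire/release.

    Acquire: an operation with aq set is ordered before everything after it.
    Release: an operation with rl set is ordered after everything before it.
    """
    pairs = set()
    for i, j in combinations(range(len(ops)), 2):
        if ops[i]["aq"] or ops[j]["rl"]:
            pairs.add((i, j))
    return pairs

def strictest(ops):
    """Assumed hardware view: every atomic behaves as if aq and rl were set."""
    return [dict(o, aq=True, rl=True) if o["atomic"] else o for o in ops]

PROGRAM = [
    op("ld a"),
    op("amoswap.aq", atomic=True, aq=True),
    op("sd b"),
    op("amoswap.rl", atomic=True, rl=True),
    op("ld c"),
]

required = ordered_pairs(PROGRAM)             # what the annotations ask for
enforced = ordered_pairs(strictest(PROGRAM))  # what the stricter model gives
print(required <= enforced)                   # every required ordering is kept
print(sorted(enforced - required))            # extra orderings: the added cost
```

The extra pairs in the second line of output are orderings the program did not ask for; they are the synchronization cost mentioned above.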

So, are all relevant instructions designed based on the effects after applying the strictest annotations? I haven't looked closely at the AMO-related code yet. I see that the implementation of the fence is as described in the manual's implementation guide (interpret all fences as if they were FENCE RW,RW).

Another question is about the implementation of the fence, which seems mainly to flush the store buffer so that all stores enter the global memory order in sequence. But how is the order of loads ensured? For example, in "ld1-fence-ld2", I don't see a mechanism that prevents ld2 from executing (i.e., obtaining its value) ahead of ld1.
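To make the concern concrete, here is a small message-passing litmus sketch. It enumerates every interleaving of a writer (store data, then store flag) with a reader, and records what the reader sees. The fences themselves are not modelled; they are represented only by whether each thread's operations stay in program order. All names (`outcomes`, `WRITER`, `IN_ORDER`, `REORDERED`) are hypothetical.

```python
from itertools import combinations

def outcomes(writer, reader):
    """Run every interleaving of two in-order threads; collect (flag, data)."""
    n, m = len(writer), len(reader)
    seen = set()
    # Choose which global time slots belong to the writer thread.
    for slots in combinations(range(n + m), n):
        mem = {"data": 0, "flag": 0}
        regs = {}
        w, r = iter(writer), iter(reader)
        for pos in range(n + m):
            kind, addr = next(w) if pos in slots else next(r)
            if kind == "st":
                mem[addr] = 1          # stores always write 1 in this sketch
            else:
                regs[addr] = mem[addr]  # loads record the current value
        seen.add((regs["flag"], regs["data"]))
    return seen

WRITER = [("st", "data"), ("st", "flag")]     # st data; fence; st flag
IN_ORDER = [("ld", "flag"), ("ld", "data")]   # ld1; fence; ld2, order kept
REORDERED = [("ld", "data"), ("ld", "flag")]  # ld2 obtains its value first

print((1, 0) in outcomes(WRITER, IN_ORDER))   # stale data with flag set?
print((1, 0) in outcomes(WRITER, REORDERED))
```

If ld2 could obtain its value before ld1, the reader could observe the flag as set while still reading stale data, which is the outcome the fence is meant to forbid.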

  1. Yes
  2. The fence instruction is decoded with three flags (noSpecExe, blockBackward, flushPipe), which mean, respectively: execute non-speculatively, block subsequent instructions from entering the out-of-order window, and flush the pipeline. Together these maintain the order of loads across the fence. See:
    FENCE -> List(SrcType.pc, SrcType.imm, SrcType.X, FuType.fence, FenceOpType.fence, N, N, N, Y, Y, Y, SelImm.X),
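The effect of the first two flags can be sketched with a toy scheduler. This is a minimal model under assumptions, not the XiangShan pipeline: `Inst`, `completion_order`, and the field names are hypothetical, the scheduler adversarially completes the youngest eligible instruction first, a non-speculative instruction is assumed to wait until it is the oldest in the window, and flushPipe is not modelled at all.

```python
from dataclasses import dataclass

@dataclass
class Inst:
    name: str
    no_spec_exec: bool = False    # may only execute once it is the oldest
    block_backward: bool = False  # younger instructions may not dispatch

def completion_order(program):
    """Return instruction names in the order they complete."""
    done, window = [], []
    pc, blocked = 0, False
    while pc < len(program) or window:
        # Dispatch in program order until a blocking instruction is in flight.
        while pc < len(program) and not blocked:
            inst = program[pc]
            window.append(inst)
            pc += 1
            blocked = inst.block_backward
        # Worst case for ordering: complete the youngest eligible instruction.
        for idx in range(len(window) - 1, -1, -1):
            inst = window[idx]
            if inst.no_spec_exec and idx != 0:
                continue  # non-speculative: wait for all older instructions
            window.pop(idx)
            done.append(inst.name)
            if inst.block_backward:
                blocked = False  # dispatch of younger instructions resumes
            break
    return done

fence = Inst("fence", no_spec_exec=True, block_backward=True)
print(completion_order([Inst("ld1"), Inst("ld2")]))         # no fence
print(completion_order([Inst("ld1"), fence, Inst("ld2")]))  # with fence
```

In this model, ld2 cannot even enter the window until the fence has completed, and the fence cannot complete until ld1 has, so ld2 never obtains its value ahead of ld1.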

Understood, thank you very much for your response! @wakafa1