About how herd models RISC-V memory model primitives

Question

About how herd models RISC-V memory model primitives

DehengYang opened this issue a year ago · comments

Dear Authors,

Sorry for interrupting you here. I am a newcomer to Herd, and when I am reading the source code of Herd for formally modeling memory consistency of ISAs (e.g., RISC-V) via CAT language. I am confused about some parts of the source code, which I would like to present for your help:

herd/libdir/riscv-defs.cat: I understand that the RISC-V fences consists of ten fences, but for the following code:

let fence.r.rw = [R];fencerel(Fence.r.rw);[M]

I fail to find any definition of Fence or Fence.r.r within the project. Also, I fail to find the corresponding definition of [R], [W], or [M]. Where can we find the definition of these variables? Thank you!

for let RCsc = (Acq|Rel|AcqRel) & (AMO|X) and let fencerel(B) = (po & (_ * B)) ; po Where can we find the definition of the X, B, and _ variable? I checked stdlib.cat but there is no related definition...
for let AMO = try AMO with (R & W) (* Compat *), I have read your paper "Syntax and semantics of the weak consistency model specification language cat" and noticed that & means intersection. But it seems the intersection of R (i.e., Read events) and W (i.e., Write events) should be an empty set. How should we interpret this code?

It would be sincerely appreciated if any comments or guidance could be provided. Thank you in advance for your time and great patience!

Yours sincerely,
Deheng

Luc Maranget · Answer 1 · Tue Apr 25 2023 23:55:07 GMT+0800 (China Standard Time)

Dear Authors,

Sorry for interrupting you here. I am a newcomer to Herd, and when I am reading the source code of Herd for formally modeling memory consistency of ISAs (e.g., RISC-V) via CAT language. I am confused about some parts of the source code, which I would like to present for your help:

herd/libdir/riscv-defs.cat: I understand that the RISC-V fences consists of ten fences, but for the following code:
let fence.r.rw = [R];fencerel(Fence.r.rw);[M]
I fail to find any definition of Fence or Fence.r.r within the project. Also, I fail to find the corresponding definition of [R], [W], or [M]. Where can we find the definition of these variables? Thank you!

All those bindings are parts of the "initial environment" that OCaml code defines before calling the CAT interpreter.

The sets R, W and M are defined for all architectures in the file machModelChecker.ml, here. Of course there are many other definitions there

Some bindings, such as Fence Fence.r.r are architecture specific, (here RISCV specific) they are defined arch specific files, name line XXXArch_herd.ml, where XXX stands for the conventional architecture name, .e.g. RISCV. More specifically the RISCV "fence" sets are defined here. More precisely, the code above defines the OCaml list of pairs barrier_sets, which the generic code in machAction uses to construct a new list of pairs arch_sets here, which the code in file machModelChecker.ml finally uses to create the initial environment here.

for let RCsc = (Acq|Rel|AcqRel) & (AMO|X) and let fencerel(B) = (po & (_ * B)) ; po Where can we find the definition of the X, B, and _ variable? I checked stdlib.cat but there is no related definition...

You'll find the definition of X in herd/AArch64Arch_herd.ml here, as X (set of atomic accesses) is AArch64 specific. The variable B is the formal argument of the function fencerel, hence it is defined in the code above in some sense. Finally, _ is not a variable but some kind of keyword that is parsed as the Universe constant here

for let AMO = try AMO with (R & W) (* Compat *), I have read your paper "Syntax and semantics of the weak consistency model specification language cat" and noticed that & means intersection. But it seems the intersection of R (i.e., Read events) and W (i.e., Write events) should be an empty set. How should we interpret this code?

This case is very specific: some variants of the C model express read-modify-writes action as an event that is both a read and an write. No hardware model is similar: their read-modify-write constructs generate a read and a write event.

It would be sincerely appreciated if any comments or guidance could be provided. Thank you in advance for your time and great patience!

Yours sincerely, Deheng

You are welcome

Deheng Yang · Answer 2 · Sat May 06 2023 16:36:42 GMT+0800 (China Standard Time)

Dear Prof. Maranget @maranget ,

Thank you so much for the detailed guidance. These comments are extremely useful in presenting a comprehensive view of Herd's modeling process. Thank you!

I would also like to apologize for my late reply. These days I have tried to debug Herd and further encountered some problems. I find there are two ways to debug Herd project:

In command line: ocamldebug /home/apr/herdtools7/_build/default/herd/herd.bc -show prop /home/apr/herdtools7/herd/tests/instructions/RISCV/A01.litmus
In VSCode: install ocamlearlybird, OCaml Platform extensions and then debug /home/apr/herdtools7/_build/default/herd/herd.bc -show prop /home/apr/herdtools7/herd/tests/instructions/RISCV/A01.litmus

I currently used the second debugging approach as it is more user-friendly (with an interface) and sometimes use the first approach as the breakpoint in VSCode debugging sometimes does not work (e.g., breakpoint at Herd.ml:255, this is weird). But I found it difficult in figuring out how Herd works, my debugging flow and corresponding problems are as follows:

entry point (parameter parser): _build/default/herd/herd.ml 453 (code: Arg.parse options get_cmd_arg ... ) that parses the given parameters and then store the values in opts (see opts.ml)
_build/default/herd/herd.ml 686 (code: try from_file name seen) that starts to read and parse the litmus test file A01.litmus

Q1: How does this line of code (fun name seen -> try from_file name seen) get the value of name and seen? I observed in debugging that the function fun name seen is invoked but I cannot figure out how this function obtains the two parameters.
parseTest.ml 33(code: let do_from_file start_time env name chan) that already opens A01.litmus file as an in_channel.

Q2: When I use print chan in terminal, it prints: chan: in_channel = <abstr>, so how can I access the information of the chan instance?
parseTest.ml 54(code: if Conf.check_name tname then begin)

Q3: I understand that Conf is an abstract signature but I cannot figure out the concrete method of Conf.check_name(RunTest.Config just declares it without a concrete function body).
getModel.ml 31(code: let parse archcheck arch libfind variant model) that invokes Model.get_default_model to obtain riscv.cat
parseTest.ml 111(code: ``RISCV -> let module X = RISCVParseTest.Make(Conf)(ModelConfig) in X.run cache_type dirty start_time name chan env splitted) that starts to invoke X.run`

Q4: When I run step(i.e., s) and backtrace(i.e., bt) in terminal, the Herd project suddenly jumped to SymbConstant module, without any clues about how it jumps to this place... The output is as follows:

Time: 48879 - pc: 0:6821208 - module ParseTest
111 <|b|>let module X = RISCVParseTest.Make(Conf)(ModelConfig) in
s
Time: 48880 - pc: 0:4514832 - module SymbConstant
43 <|b|>Constant.pp (Scalar.pp hexa) (PteVal.pp hexa) pp_instr_cst
bt
Backtrace:
0 SymbConstant lib/symbConstant.ml:43:5
1 SymbConstant lib/symbConstant.ml:47:22
(Encountered a function with no debugging information)
**Core Logic of Herd simulation:** topHerd.ml 351(code:let run start_time test =)
topHerd.ml 353(Code: let { MC.event_structures=rfms; MC.overwritable_labels=owls; },test =)

Q5: Although I used print rfms/print MC, I failed to figure out the definition and concrete value of rfms...
topHerd.ml 503(code:do_show () ;) that finishes simulation and prints result into terminal.

I observed that many code were executed between topHerd.ml 351 and topHerd.ml 503, but the concrete exeuction path is still difficult for me to understand.

Following your advice in the above comment, I have got a bit more familiar with Herd code logic. But there are still many questions ahead for me to overcome due to the large size of Herd project in terms of source code. The whole project of Herdtools suite has reached 100,335 OCaml lines and the herd folder has reached 21,913 OCaml lines.

As a fan of Herd, I would sincerely appreciate it if any guidance could be provided about the above questions or a smooth workflow to learn Herd source code could be recommended.

I am sorry for my lengthy questions... Thank you very much for your great help.

Yours sincerely,
Deheng