Understanding the use of TreadleTester in CPU simulation
learning-chip opened this issue · comments
Hi, thanks for this wonderful teaching project.
I'd like to understand the Scala/Treadle-based simulation engine, as it allows faster prototyping than Verilog-based simulation in RocketChip/Chipyard. However, I am having a hard time understanding the simulator source code, in particular CPUTesterDriver.scala and simulate.scala, partly because the Treadle page (https://www.chisel-lang.org/treadle/) has very little documentation.
Here are my main questions. Thanks in advance!
1. What benefits does TreadleTester offer over the vanilla ChiselTest?
Looking at the usage of TreadleTester, it looks similar to the vanilla poke() and step() inside ChiselTest's test block:
dinocpu/src/main/scala/testing/CPUTesterDriver.scala, lines 56 to 58 in 124cc11
dinocpu/src/main/scala/testing/CPUTesterDriver.scala, lines 64 to 68 in 124cc11
dinocpu/src/main/scala/testing/CPUTesterDriver.scala, lines 190 to 194 in 124cc11
dinocpu/src/main/scala/testing/CPUTesterDriver.scala, lines 241 to 247 in 124cc11
So, what feature will be missing if I just use the simple ChiselTest?
2. How does the RISC-V binary get loaded into instruction memory?
The relevant code I find is:
dinocpu/src/main/scala/testing/CPUTesterDriver.scala, lines 43 to 58 in 124cc11
However, I can't understand how the binary/instructions get passed to the instruction memory in the CPU, which is located in a totally separate file/module:
dinocpu/src/main/scala/memory/memory-port-io.scala, lines 35 to 38 in 124cc11
dinocpu/src/main/scala/memory/base-memory-components.scala, lines 39 to 40 in 124cc11
The loadMemoryFromFile call looks similar to Verilog's $readmemb/$readmemh, which can load instructions for simulation. But I don't see how it is invoked via the top-level TreadleTester...
3. Can the Treadle Execution Engine be used with other RISC-V testing frameworks?
In particular https://github.com/riscv/riscv-tests and https://github.com/ucb-bar/riscv-torture, which provide more complete functional coverage. From their docs they seem to require Verilog simulators like Verilator or Synopsys VCS.
Related question:
https://stackoverflow.com/questions/55587524/simulating-a-cpu-design-written-in-chisel
So, what feature will be missing if I just use the simple ChiselTest?
One useful thing I find is TreadleTester.pokeMemory, which can modify the internal memory without requiring an explicit I/O interface (correct?).
Is it true that TreadleTester can poke any signal while ChiselTest can only poke Input() signals? (ref: https://stackoverflow.com/a/59292064)
Great questions! I'll do my best to answer them, but I'm probably not the best resource. A lot of this code was written quickly on a deadline and written about 2 years ago. I'll do my best to remember what I was thinking!
Also, thanks for your interest here!
So, what feature will be missing if I just use the simple ChiselTest?
One useful thing I find is TreadleTester.pokeMemory, which can modify the internal memory without requiring an explicit I/O interface (correct?).
Is it true that TreadleTester can poke any signal while ChiselTest can only poke Input() signals? (ref: https://stackoverflow.com/a/59292064)
Yes, I believe that's correct. IIRC, ChiselTest didn't support loadMemoryFromFile but Treadle did. I'm not sure if that's still the case.
It also had a more general simulation interface than the ChiselTest interface. Also, I believe I talked to the developers and they were planning on deprecating ChiselTest and moving towards only using Treadle.
2. How does the RISC-V binary get loaded into instruction memory?
Here's the relevant code:
The filename is passed through the configuration object. I believe the file is a text file with a hex word on each line, but I may be misremembering.
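As a rough illustration of that file format, here is a minimal sketch in plain Scala. It assumes one hexadecimal word per line, which the reply above is itself unsure of, and the names HexImage and parse are hypothetical and do not come from the DINOCPU source.

```scala
// Sketch only: assumes a text image with one hexadecimal word per line.
// HexImage and parse are hypothetical names, not part of DINOCPU or Treadle.
object HexImage {
  /** Parse the text into memory words, skipping blank lines. */
  def parse(text: String): Vector[BigInt] =
    text.split("\n").toVector
      .map(_.trim)
      .filter(_.nonEmpty)
      .map(word => BigInt(word, 16)) // radix 16: each line is a hex word
}

object HexImageDemo extends App {
  // Three example 32-bit words, one per line.
  val words = HexImage.parse("00500093\n00a00113\n\n002081b3\n")
  words.zipWithIndex.foreach { case (word, index) =>
    println(s"word $index = 0x${word.toString(16)}")
  }
}
```

In a Treadle-based flow, each parsed word could in principle be written with a pokeMemory call of the form used later in this thread, but in DINOCPU the loading is done through loadMemoryFromFile rather than by hand.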
3. Can the Treadle Execution Engine be used with other RISC-V testing frameworks?
I don't see why not, at least in theory. The main impediment is that those tests assume that there is a proxy kernel (pk) running underneath to handle I/O and exceptions. The DINOCPU doesn't implement exceptions, though that is something I'm considering for the future :)
Treadle is just an RTL simulator written in Scala. If your RTL supports the testing frameworks, then I don't see why Treadle wouldn't.
One other thing I'll mention... You're right that the Treadle documentation isn't great. In fact, I think I'm using some internal APIs in my code here :). I figured most things out by using IntelliJ and reading the Treadle source code. Stepping through and doing live introspection was how I figured out most of the "API" that I'm using. I also have found that the Treadle developer (and the Chisel/FIRRTL developers more generally) are incredibly helpful. They even fixed some bugs in the code here for me!
Good luck! And let me know if there are any other questions I can (try to) answer.
Thank you for the thorough reply.
I believe I talked to the developers and they were planning on deprecating ChiselTest and moving towards only using Treadle.
Interesting... The ChiselTest page says that "if you’re fine living on the bleeding edge, give it a try", so I thought they are advocating ChiselTest instead of deprecating it. Also, the TreadleTester page says "it will be one of the standard back-ends available as part of the chisel-testers project", so it appeared to me that users will access TreadleTester via the higher-level ChiselTest interface😂
The filename is passed through the configuration object.
Ah, I found the relevant code, which confused me at first:
dinocpu/src/main/scala/testing/CPUTesterDriver.scala
Lines 24 to 34 in 124cc11
So the filename is initially specified by SimulatorOptionsManager, which extends the TreadleOptionsManager class (again, undocumented😂). The filename is then passed to CPUConfig, which can initialize a certain type (e.g. pipelined or single-cycle) of CPU module. With the conf variable that contains all necessary CPU parameters (including the memory file path), we can build the simulator (equivalent to the DUT c in ChiselTest) by the following calls in CPUTesterDriver.scala:
val compiledFirrtl = build(optionsManager, conf)
val sourceAnnotation = FirrtlSourceAnnotation(compiledFirrtl)
val simulator = TreadleTester(sourceAnnotation +: optionsManager.toAnnotationSeq)
(Again, found no doc on FirrtlSourceAnnotation😂)
The rest of the test process is easy to understand, as the simulator behaves like the DUT c in ChiselTest.
So, my question is: what's the benefit of using a dedicated TreadleOptionsManager class to configure the CPU and test? From my limited Chisel experience, I would simply define a bunch of parameters in the top-level CPU module, and initialize the test following the TreadleTester example:
val s = Driver.emit(() => new MyCPU(myConfiguration))
val tester = TreadleTester(s)
// then just like normal ChiselTest...
tester.poke(...)
tester.peek(...)
What's the limitation of this simple approach? Could you recommend any resources on the coding practice for a complicated chisel project?
those tests assume that there is a proxy kernel (pk) running underneath to handle I/O and exceptions.
Do you mean something like https://github.com/riscv/riscv-pk/? For simple instructions it should be fine then...
I also have found that the Treadle developer (and the Chisel/FIRRTL developers more generally) are incredibly helpful.
Good to know, for more general questions I will post on their GitHub issues :)
I also hit a bug with TreadleTester.poke. Not sure if I should ask here or on Treadle issues.
Basically I tried to modify the internal register state, following this code segment:
dinocpu/src/main/scala/testing/CPUTesterDriver.scala
Lines 64 to 68 in 124cc11
My code looks like:
// inside module
val reg = Reg(UInt(32.W))
val regs = Reg(Vec(4, UInt(32.W)))
...
// during test
tester.poke("reg", BigInt(1))
tester.poke("regs_0", BigInt(1))
The full code is https://gist.github.com/learning-chip/f052dea8f83780e98c87c715122e4f8e
(can run in the online notebook of https://github.com/freechipsproject/chisel-bootcamp)
I got the error message treadle.executable.TreadleException: setValue: Cannot find reg in symbol table. But how can I inspect the symbol table then?
Weirdly, pokeMemory with a similar syntax works well:
val mem = Mem(4, UInt(32.W))
...
testerMem.pokeMemory("mem", addr, BigInt(value))
Interesting... The ChiselTest page says that "if you’re fine living on the bleeding edge, give it a try", so I thought they are advocating ChiselTest instead of deprecating it. Also, the TreadleTester page says "it will be one of the standard back-ends available as part of the chisel-testers project", so it appeared to me that users will access TreadleTester via the higher-level ChiselTest interface😂
I must be misremembering... they deprecated something...
What's the limitation of this simple approach? Could you recommend any resources on the coding practice for a complicated chisel project?
TBH, that could work. As I mentioned before, I wrote most of this code a couple of years ago (or maybe last year). Treadle, at the time, was quite new. The APIs could have been cleaned up.
I got the error message
treadle.executable.TreadleException: setValue: Cannot find reg in symbol table.
But how can I inspect the symbol table then?
Aha! This I can answer with confidence!
FIRRTL pretty aggressively optimizes out unused wires and registers. So, if there's either a bug in your code or if you're not implementing the whole execution core it will often optimize out the registers (they are "unused" in its mind). When it does this, my hardcoded values for poking the registers fail.
There are two solutions: 1) The hacky approach is to add printf statements to force the wires to be kept. 2) The correct approach is to use a FIRRTL annotation which tells the compiler not to optimize the wire/register away with dontTouch. Here's an example: https://github.com/jlpteaching/dinocpu-wq21/blob/main/src/main/scala/single-cycle/cpu.scala#L17
The correct approach is to use a FIRRTL annotation which tells the compiler not to optimize the wire/register away with dontTouch.
Hmm... I wrapped the internal regs by:
val reg = dontTouch(Reg(UInt(32.W)))
val regs = dontTouch(Reg(Vec(4, UInt(32.W))))
But I still got the same error, Cannot find reg in symbol table. Can you see any other possible cause?
Updated code: https://gist.github.com/learning-chip/f052dea8f83780e98c87c715122e4f8e , again runs in the bootcamp online notebook.
Another related question: besides Reg, sometimes Mem can also be optimized away by FIRRTL. However, using dontTouch on Mem leads to this error:
inferred type arguments [chisel3.Mem[chisel3.UInt]] do not conform to method apply's type parameter bounds [T <: chisel3.Data]
Adding an output port can prevent the Mem from being optimized away (without using dontTouch). The CoreIO module in your project seems to serve this purpose (it exposes the Mem's IO interface to the top-level CPU module). However, say I want to define a CPU module without exposing a memory IO port, like this:
class Memory extends Module {
....
val mem = Mem(4, UInt(32.W)) // internal variable
...
}
class Cpu extends Module {
...
val memory = Module(new Memory) // internal variable
...
}
In this case, the memory is optimized away, and pokeMemory fails:
val testerError = TreadleTester(Driver.emit(() => new Cpu(outputMem=false)))
testerError.pokeMemory("memory.mem", 0, 1)
// Error: treadle.executable.TreadleException: Error: memory memory.mem.forceWrite(0, 1). memory not found
Full code: https://gist.github.com/learning-chip/43f11fc7c57daff44fdf437ce0151fc5
Is there a correct way to combine dontTouch and Mem?
One important thing to note is that the error you encountered states that dontTouch expects a type T <: chisel3.Data, which is Scala's way of saying that T must be a subtype of Data.
Reg is a special case where the object technically isn't a subtype of Data, but its apply() method returns a T <: Data object. So for all practical purposes any instance of val x = Reg(...) can be treated as a Data type, and dontTouch() will work on it.
Mem isn't a Data; its apply() returns a Mem[T <: chisel3.Data]. So dontTouch() will not work on it.
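The type-bound argument can be reproduced without Chisel. The sketch below uses stand-in classes that only mimic the shape described above (Data, UInt, Reg, Mem, and dontTouch here are simplified placeholders, not the real chisel3 definitions), so it shows the Scala mechanism rather than the library's actual code.

```scala
// Stand-in types only: they mimic the shape of the chisel3 classes discussed
// above and are NOT the real library definitions.
object BoundsSketch {
  class Data
  class UInt extends Data
  class Mem[T <: Data] // deliberately not a subtype of Data

  // Reg's apply hands back the Data it was given.
  object Reg { def apply[T <: Data](t: T): T = t }
  // Mem's apply hands back a Mem[T], which is not a Data.
  object Mem { def apply[T <: Data](size: Int, t: T): Mem[T] = new Mem[T] }

  // Same upper bound as in the error message: the argument must be a Data.
  def dontTouch[T <: Data](data: T): T = data

  val reg = dontTouch(Reg(new UInt)) // compiles: Reg(...) yields a UInt
  val mem = Mem(4, new UInt)
  // dontTouch(mem) // rejected by the compiler: Mem[UInt] does not conform to T <: Data
}
```

Uncommenting the last line should produce a bounds error of the same kind as the one quoted in the question.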
In your example, the Mem does not have any inbound or outbound signals, so from the perspective of dead code elimination (DCE) it is 'safe' to optimize away, since this would not introduce any side effects to the circuit. And, since DCE runs as part of FIRRTL compilation, before the simulator is built, you can't really use TreadleTester to poke this memory because it has already been removed by that point.
So, you must necessarily connect the Mem to something, usually the module's IO. The good thing about this is that the apply() of an IO returns a T <: Data, so you can just dontTouch the IO, and as long as it's properly wired the Mem should not be optimized away.
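To make the reachability argument concrete, here is a toy model in plain Scala. It is only a sketch: the signal names and the graph encoding are hypothetical, and the real FIRRTL pass is far more involved. Signals are kept if they can be reached from a root (an output port, or in this simplified picture anything marked with dontTouch) by following "reads from" edges.

```scala
// Toy model only: a hypothetical, much simplified picture of dead code
// elimination. It is not the FIRRTL implementation.
object DceSketch {
  /** Keep every signal reachable from a root by following "reads from" edges. */
  def live(readsFrom: Map[String, Set[String]], roots: Set[String]): Set[String] = {
    @annotation.tailrec
    def loop(frontier: Set[String], seen: Set[String]): Set[String] =
      if (frontier.isEmpty) seen
      else {
        // Signals read by the current frontier that have not been visited yet.
        val next = frontier.flatMap(s => readsFrom.getOrElse(s, Set.empty[String])) -- seen
        loop(next, seen ++ next)
      }
    loop(roots, roots)
  }

  // Case 1: the memory has no path to any output, so it is not kept.
  val isolated: Map[String, Set[String]] =
    Map("io_out" -> Set("counter"), "counter" -> Set.empty[String], "mem" -> Set.empty[String])

  // Case 2: an output port reads the memory, so the memory is kept.
  val wired: Map[String, Set[String]] =
    Map("io_rdata" -> Set("mem"), "mem" -> Set("io_addr"))
}
```

Calling DceSketch.live(DceSketch.isolated, Set("io_out")) leaves "mem" out of the result, while DceSketch.live(DceSketch.wired, Set("io_rdata")) includes it, which mirrors why wiring the Mem to an IO keeps it in the design.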
So, you must necessarily connect the Mem to something, usually the module's IO. The good thing about this is that the apply() of an IO returns a T <: Data, so you can just dontTouch the IO, and as long as it's properly wired the Mem should not be optimized away.
Thanks, that's a very useful explanation