m-labs / nmigen-soc

System on Chip toolkit for nMigen

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Control/status registers

whitequark opened this issue · comments

Let's collect requirements for CSRs here.

Overview of oMigen design:

  • Two kinds of registers: CSRStatus (R/O) and CSRStorage (R/W);
  • Located on the dedicated CSR bus;
  • Does not support atomic reads or writes;
  • Includes ad-hoc support for generating C and Rust code, and dumping to CSV;
  • Does not have any support for generating documentation;
  • AutoCSR collects CSRs from submodules via Python introspection;
  • CSRConstant allows specifying limited supplementary data in generated code.

According to @sbourdeauducq, the primary semantic problem with oMigen CSR design is that it lacks atomicity, and fixing that would involve ditching the CSR bus and instead using Wishbone directly.

According to @mithro and @xobs, a significant problem with oMigen CSR design is that it does not allow generating documentation.

According to @whitequark, a problem with oMigen CSR design is that the code generation is very tightly coupled to MiSoC internals, and is quite hard to maintain.

Here's my proposal for a new CSR design:

  1. I think that the CSR storage element itself should be completely separate from any bus used to access it. If a CPU will be wired to access some CSRs, it is the responsibility of that CPU to collect CSRs and connect that to any bus appropriate for that CPU.
  2. The adapter to this bus should provide atomicity guarantees.
  3. Instead of being a bare Signal, a CSR should be defined as a series of fields, similar to a Record but in a more structured way. That includes attaching documentation to individual bit fields.
  4. To simplify integration and improve safety, it should be possible to specify valid values for a field: it could be a enumeration, or a range. (This is similar to what SVD supports.)
  5. Similarly, the reset value should be a property of the field.
  6. The hierarchical structure of CSR banks and the complete set of field properties should be exportable in a structured format like JSON. Optionally, it should also be exportable as SVD, so that the Rust code generator does not need to be rewritten, and svd2rust could be reused. (SVD is notionally ARM-specific, but not really restricted to ARM in an inherent way; it is very flexible.)
  7. The replacement for CSRConstant should be more structured. At the very least, it should be associated with a specific CSR bank, so that languages with modules like Rust can place it correctly.
  8. Instead of AutoCSR, I propose to use a mechanism similar to hdl.dsl._ModuleBuilderSubmodules. I suggest that adding a new CSR would look something like:
    self.csrs = CSRBank()
    self.csrs.baud = CSRRegister("baud", [
        CSRField("value", 16, reset=9600)
    ])
    or perhaps even something based around with constructs.
  9. Debatable: I propose to unify CSRStatus and CSRStorage, and instead use per-bit writability. This is partly to simplify the interface (there is only one kind of CSR that needs to be dealt with), and partly so that atomic reset operations on flag bits would be possible using the r_c1 bit type.

I've already put together a simple wrapper around CSRs that I initially called dreg: https://github.com/xobs/dreg/blob/master/dreg.py

It contains things such as values, which is a list of (value, description) tuples that let you specify values in a freeform manner. This would look something like:

                    values=[
                        ("0b0000", "disable the timer"),
                        ("0b0001", "slow timer"),
                        ("0b1xxx", "fast timer"),
                    ]

I made the rule that all Dreg Fields must be valid Python identifiers, except they must not start with "_". Then I expose various values under "_".

For example, RegStorage has a parameter you can pass to it resettable=True that will wrap the underlying CSRStorage in a ResetInserter(). This allows the design to strobe reg._reset in order to reset the underlying CSR storage object.

I also added pulse=True as a Field parameter that will cause that particular signal to only be active when _we is 1. This is particularly useful for Go events, such as starting a particular operation.

Finally, RegStorage and RegStatus expose all of their fields as Signals that can be accessed.

For example, with this RegStatus:

        self.status = status = RegStatus(
            Field("have", description="`1` if there is data in the FIFO."),
            Field("is_in", description="`1` if an IN stage was detected."),
            Field("epno", size=4, description="The destination endpoint for the most recent SETUP token."),
            Field("pend", description="`1` if there is an IRQ pending."),
            Field("data", description="1` if a DATA stage is expected."),
            description="Status and diagnostic information about the SETUP handler",
        )

You can then access status.pend and get the "pending" bit.

For example, RegStorage has a parameter you can pass to it resettable=True that will wrap the underlying CSRStorage in a ResetInserter().

Shouldn't all CSRs be reset when the underlying core is reset as well? Having a separate reset just for registers seems odd.

I also added pulse=True as a Field parameter that will cause that particular signal to only be active when _we is 1. This is particularly useful for Go events, such as starting a particular operation.

This seems like a desirable feature.

You can then access status.pend and get the "pending" bit.

I agree that an instantiated CSR should be "Record-like", possibly lowered to a Record.

Shouldn't all CSRs be reset when the underlying core is reset as well? Having a separate reset just for registers seems odd.

Yes, however it's sometimes important to have additional resets. For example, in my current design there is a USB address register that's a CSRStorage that contains the current USB address. When the host does a USB reset, it is necessary to reset this address. So I wire up the usb_reset signal to address._reset in order to clear the value.

I agree that an instantiated CSR should be "Record-like", possibly lowered to a Record.

One issue I'm running into -- in litex, several values are assumed to exist on a CSR. For example, name. That means that it is no longer possible to have a Field called name. The same holds true for storage, status, we, and re, among other reserved names.

When the host does a USB reset, it is necessary to reset this address. So I wire up the usb_reset signal to address._reset in order to clear the value.

I think this might be better addressed with a handshake mechanism. But I will consider this use case.

One issue I'm running into -- in litex, several values are assumed to exist on a CSR. For example, name. That means that it is no longer possible to have a Field called name. The same holds true for storage, status, we, and re, among other reserved names.

We can use the same trick as in Module: we could have all the fields under csr.field, e.g. csr.field.pend, and possibly also shorten it to csr.f.pend.

Couple of random thoughts;

  • Divorcing the CSRs from the bus they are implimented on sounds great.

  • Being able to output SVD format would be super awesome.

  • It might make sense to seperate the code to generate accessors from the RTL totally. Currently C and Rust are popular output, but MicroPython / Circuit Python are likely to be popular in the future. There might also be multiple styles of C accessors you want to generate (IE optimised for dynamic configuration verse size verse speed, etc). They could all just take the SVD, JSON or CSV file and work from that?

  • The CSR bus uses less resources inside an FPGA than wishbone, so it would be good to keep that. (Lots of wishbone slaves can get very expensive...).

  • The code which collects all the CSRs found under a given heirachy should be common code.

  • Many CPUs will want to just use MMIO. That should be trivial for a CPU to use without thinking.

Hopefully that was somewhat coherent.

  • It might make sense to seperate the code to generate accessors from the RTL totally.

The code generator will take an abstract model of the registers, yes. It doesn't make any sense to always roundtrip it through JSON, but the abstract model should be bidirectionally convertible to/from canonical JSON representation.

  • The CSR bus uses less resources inside an FPGA than wishbone, so it would be good to keep that.

The CSR bus doesn't support atomic updates, which is a significant problem. That would need to be solved.

  • The code which collects all the CSRs found under a given heirachy should be common code.
  • Many CPUs will want to just use MMIO. That should be trivial for a CPU to use without thinking.

Of course. Are there any particular considerations for MMIO you have that I'm missing?

The CSR bus uses less resources inside an FPGA than wishbone, so it would be good to keep that. (Lots of wishbone slaves can get very expensive...).

It that because of the "OR-multiplexing" trick done in CSR but not WB, or are there other reasons?
If it's the former and that is really important, we can add a RE signal to the CSR bus.

Why can't the OR-multiplexing trick be done inside the CSR WB adapter, anyway?

I have no actual proof here, so it would be good for someone to do some real analysis rather than my anecdotes.

There might be some useful information @ https://zipcpu.com/zipcpu/2019/08/30/subbus.html

The things I have seen are;

  • Almost all peripherals have some type of CSR registers (I would guess average is like ~2 read, ~2 write).
  • The number of bits per CSR register is pretty low (IE 2-4 bits).
  • Very few peripherals require a full wishbone bus (less than 10%).
  • Things that require a wishbone bus are more likely to also want to be a wishbone master rather than "just" a wishbone slave.
  • Cost of having a large number of wishbone slaves seems large.
  • The cost of a 32bit wide bus verses an 8 wide bus that goes everywhere inside the FPGA is an important factor.

You can make a 8-bit Wishbone bus.

The only real differences between Wishbone and CSR with read-enable signal are:

  • The return lines on Wishbone require a multiplexer in the interconnect, with the selection signal driven by the interconnect. On the CSR bus, the target devices recognize their address, and output 0 whenever they are not selected. Then you can simply OR the return lines together (this is what I meant above by "OR-multiplexing trick").
  • Wishbone supports variable latency with the ack signal, CSR requires all targets to answer in one cycle (like a SRAM).

If that's important enough we can keep CSR, and simply add a read enable signal to support atomic reads (the protocol could be that reading the first byte in a multi-byte CSR makes the target copy the other bytes in that CSR to a buffer that is read from the bus instead of the current value).

A long time ago, someone said the read/write perspectives in MiSoC's CSRs are inconsistent (link). Would you guys think it's better to call all the read/write data and strobe signals from the CPU perspective?

@whitequark

  1. Debatable: I propose to unify CSRStatus and CSRStorage, and instead use per-bit writability. This is partly to simplify the interface (there is only one kind of CSR that needs to be dealt with), and partly so that atomic reset operations on flag bits would be possible using the r_c1 bit type.

Regarding per-bit read/write accessibility, I found an evidence to support such an idea. If we look at this SVD example (please find in page this string: <!-- SR: Status Register -->), with my own understanding some status fields like "overflowing" or "underflowing" are simply flags that can be unset without resetting the entire register. This means a "CSRStatus" register isn't necessarily read-only across all its fields. Does this explanation make sense?

Would you guys think it's better to call all the read/write data and strobe signals from the CPU perspective?

I agree that the CPU perspective makes more sense.

If we look at this SVD example (please find in page this string: <!-- SR: Status Register -->), with my own understanding some status fields like "overflowing" or "underflowing" are simply flags that can be unset without resetting the entire register. This means a "CSRStatus" register isn't necessarily read-only across all its fields. Does this explanation make sense?

Yes. That is one reason why I would like to unify CSRStatus and CSRStorage from the point of view of downstream users.

Since no one has shown any objection to this part of proposal, I think it does make sense to do that.

I've been working on lxsocdoc which can be inserted into a flow after verilog generation. An example of the output is available at https://rm.fomu.im/

This uses documented CSRs to generate field-level registers. For example, here's a status register https://github.com/im-tomu/valentyusb/blob/tri-fifo/valentyusb/usbcore/cpu/eptri.py#L536:

        self.status = status = CSRStatus(
            fields=[
                CSRField("have", description="`1` if there is data in the FIFO."),
                CSRField("is_in", description="`1` if an IN stage was detected."),
                CSRField("epno", 4, description="The destination endpoint for the most recent SETUP token."),
                CSRField("pend", description="`1` if there is an IRQ pending."),
                CSRField("data", description="`1` if a DATA stage is expected."),
            ],
            description="Status about the most recent `SETUP` transactions, and the state of the FIFO."
        )

This gets turned into a table and register map at https://rm.fomu.im/usb.html#usb-setup-status

It also gets turned into something functionally useful when it gets used:

        # Wire up the `STATUS` register
        self.comb += [
            status.fields.have.eq(buf.readable),
            status.fields.is_in.eq(is_in),
            status.fields.epno.eq(epno),
            status.fields.pend.eq(pending),
            status.fields.data.eq(have_data_stage),
        ]

In addition to CSR documentation, we now have ModuleDoc and AutoDoc. These are both in https://github.com/enjoy-digital/litex/blob/master/litex/soc/integration/doc.py

With ModuleDoc we can add elements to the current Module just like we add Signals (https://github.com/im-tomu/valentyusb/blob/tri-fifo/valentyusb/usbcore/cpu/eptri.py#L84). If a module inherits from AutoDoc then we can traverse all ModuleDoc elements within a Module and collect them all. One such example is the USB Debug Wishbone Bridge, which has its own protocol documented at https://github.com/im-tomu/valentyusb/blob/tri-fifo/valentyusb/usbcore/cpu/usbwishbonebridge.py#L19. Because this inherits AutoDoc, the protocol documentation gets included in the resulting output file: https://rm.fomu.im/usb.html#usb-wishbone-debug-protocol. This protocol would get included even if the USB Module had no CSRs at all, which is the case with DummyUSB -- if we added DummyUSB to a SoC then we'd still get the USB Wishbone Bridge documentation automatically added.

In addition to adding raw ModuleDoc objects, it is possible to inherit from ModuleDoc. If this happens, then a class' doc is used as the module documentation. An example of doing that can be seen in the SpiBone Wishbone-over-SPI bridge: https://github.com/xobs/spibone/blob/master/spibone.py#L109 Here we add a different ModuleDoc object depending on how many wires were requested, since the protocol changes with the number of wires. These protocol descriptions have their title and body inferred from the doc variable.

Finally, I had a request to allow separating documentation into a different file, to make it easy to edit it in an external editor. So you can pass file= to the ModuleDoc constructor.