kangjian888 / FlooNoC

A Fast, Low-Overhead On-chip Network

FlooNoC: A Fast, Low-Overhead On-chip Network

This repository provides modules for the FlooNoC, a Network-on-Chip (NoC) which is part of the PULP (Parallel Ultra-Low Power) Platform. The repository includes Network Interface IPs (named chimneys), Routers and further NoC components to build a complete NoC. FlooNoC mainly supports AXI4+ATOPs, but can be easily extended to other On-Chip protocols. Arbitrary topologies are supported with several routing algorithms. FlooNoC is designed to be scalable and modular, and can be easily extended with new components.

💡 Design Principles

Our NoC design is grounded in the following key principles:

  1. Full AXI4 Support: Our design fully supports AXI4 together with the atomic operations (ATOPs) from AXI5, in particular multiple outstanding burst transactions. It uses low-complexity routers and a decoupled link-level protocol to ensure scalability, thereby enabling tolerance to high-latency off-chip accesses.
  2. Decoupled Links and Networks: We use a link-level protocol that is decoupled from the network-level protocol. This allows us to move the complexity of the network-level protocol into the network interfaces while deploying low-complexity routers in the network, which enables better scalability.
  3. Wide Physical Channels: We incorporate wide physical channels in order to meet the high-bandwidth requirements at network endpoints without being constrained by the operating frequency. This is in contrast to the traditional narrow link approach. Further, the NoC avoids any kind of serialization and sends entire messages in a single flit including header and tail information.
  4. Separation of Traffic: Our design acknowledges the diversity in traffic patterns, as it decouples links and networks to handle wide, high-bandwidth, burst-based traffic and narrow, latency-sensitive traffic with separate physical channels.
  5. Modularity: Our design principles also emphasize modularity. We have developed a set of IPs that can be instantiated together to build a NoC. This approach not only promotes reusability but also facilitates flexibility in designing custom NoCs to cater to a variety of specific system requirements.

🔮 Origin of the name

The names of the IPs are inspired by the Harry Potter universe, where the Floo Network is a magical transportation system. The Network interfaces are named after the fireplaces and chimneys used to access the Floo Network.

"In use for centuries, the Floo Network, while somewhat uncomfortable, has many advantages. Firstly, unlike broomsticks, the Network can be used without fear of breaking the International Statute of Secrecy. Secondly, unlike Apparition, there is little to no danger of serious injury. Thirdly, it can be used to transport children, the elderly and the infirm."

🔐 License

Unless specified otherwise in the respective file headers, all code checked into this repository is made available under a permissive license. All hardware sources and tool scripts are licensed under the Solderpad Hardware License 0.51 (see LICENSE).

📚 Publication

If you use FlooNoC in your research, please cite the following paper:

FlooNoC: A Multi-Tbps Wide NoC for Heterogeneous AXI4 Traffic

@misc{fischer2023floonoc,
      title={FlooNoC: A Multi-Tbps Wide NoC for Heterogeneous AXI4 Traffic},
      author={Tim Fischer and Michael Rogenmoser and Matheus Cavalcante and Frank K. Gürkaynak and Luca Benini},
      year={2023},
      eprint={2305.08562},
      archivePrefix={arXiv},
      primaryClass={cs.AR}
}

⭐ Getting Started

Pre-requisites

FlooNoC uses bender to manage its dependencies and to automatically generate compilation scripts. In addition, Python >= 3.8 is required, together with the packages listed in requirements.txt.

Simulation

Currently, we do not provide any open-source simulation setup. Internally, FlooNoC was tested using QuestaSim, which can be launched in batch mode with the following commands:

# Compile the sources
make compile-sim
# Run the simulation
make run-sim-batch VSIM_TB_DUT=tb_floo_dut

or in the GUI, with prepared waveforms:

# Compile the sources
make compile-sim
# Run the simulation
make run-sim VSIM_TB_DUT=tb_floo_dut

Replace tb_floo_dut with the name of the testbench you want to simulate.

🧰 List of IPs

This repository includes the following NoC IPs:

  1. Routers: A collection of different NoC router designs with varying features such as virtual channels, input/output buffering, and adaptive routing algorithms.
  2. Network Interfaces (NIs): A set of NoC network interfaces for connecting IPs to the NoC.
  3. Topologies: A collection of NoC topologies, such as mesh, to enable the creation of various on-chip interconnects.
  4. Common IPs: A set of IPs used by the NoC IPs, such as FIFOs, cuts and arbiters.
  5. Verification IPs (VIPs): A set of VIPs to verify the correct functionality of the NoC IPs.
  6. Testbenches: A set of testbenches to evaluate the performance of the NoC IPs, including throughput and latency.

Routers

| Name | Description | Doc |
| --- | --- | --- |
| floo_router | A simple router with configurable number of ports, physical and virtual channels, and input/output buffers | |
| floo_narrow_wide_router | Wrapper of a multi-link router for narrow and wide links | |

Network Interfaces

| Name | Description | Doc |
| --- | --- | --- |
| floo_axi_chimney | A bidirectional network interface for connecting AXI4 buses to the NoC | |
| floo_narrow_wide_chimney | A bidirectional network interface for connecting narrow & wide AXI buses to the multi-link NoC | |

Topologies

| Name | Description | Doc |
| --- | --- | --- |
| floo_mesh | A mesh topology with configurable number of rows and columns | |
| floo_mesh_ruche | A mesh topology with ruche channels and a configurable number of rows and columns | |

Common IPs

| Name | Description | Doc |
| --- | --- | --- |
| floo_fifo | A FIFO buffer with configurable depth | |
| floo_cut | Elastic buffers for cutting timing paths | |
| floo_cdc | A Clock-Domain-Crossing (CDC) module implemented with a gray-counter based FIFO | |
| floo_wormhole_arbiter | A wormhole arbiter | |
| floo_vc_arbiter | A virtual channel arbiter | |
| floo_rob | A table-based Reorder Buffer | |
| floo_simple_rob | A simplistic low-complexity Reorder Buffer | |
| floo_rob_wrapper | A wrapper of all available types of RoBs including RoB-less version | |

Verification IPs

| Name | Description | Doc |
| --- | --- | --- |
| axi_bw_monitor | An AXI4 bus monitor for measuring the throughput and latency of the AXI4 bus | |
| axi_reorder_compare | An AXI4 bus monitor for verifying the order of AXI transactions with the same ID | |
| floo_axi_rand_slave | An AXI4 bus multi-slave generating random AXI responses with configurable response time | |
| floo_axi_test_node | An AXI4 bus master-slave node for generating random AXI transactions | |
| floo_dma_test_node | An endpoint node with a DMA master port and a simulation memory slave port | |
| floo_hbm_model | A very simple model of the HBM memory controller with configurable delay | |

🎛️ Configuration

The data structs for the flits and the links are auto-generated and can be configured in util/*cfg.hjson. The size of the links is automatically determined to fit the largest message going over the link into a single flit, in order to avoid any serialization.

The AXI channel(s) need to be configured in util/*cfg.hjson. The following example shows the configuration for a single AXI channel with 64-bit data width, 32-bit address width, 3-bit ID width, and 1-bit user width (beware that the ID width can differ between input and output channels).

  axi_channels: [
    {name: 'axi', direction: 'input', params: {dw: 64, aw: 32, iw: 3, uw: 1 }},
  ]
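
As a rough illustration of how these parameters drive the link size, the following Python sketch estimates the payload width of each AXI channel from dw, aw, iw and uw. It is not the repository's generator: the function name axi_payload_bits is hypothetical, and the signal list assumes the standard AXI4 signal widths plus the 6-bit ATOP signal on the AW channel. The structs that FlooNoC actually generates may contain further or differently sized fields.

```python
# Fixed-width AXI4 address-channel signals in bits (standard AXI4 widths).
AX_FIXED = {"len": 8, "size": 3, "burst": 2, "lock": 1,
            "cache": 4, "prot": 3, "qos": 4, "region": 4}
ATOP_BITS = 6   # AXI5 atomic-operation signal, assumed to travel with AW only
RESP_BITS = 2   # BRESP / RRESP


def axi_payload_bits(dw, aw, iw, uw):
    """Estimate the payload width in bits of each AXI channel (no NoC header)."""
    ax = iw + aw + sum(AX_FIXED.values()) + uw  # signals shared by AW and AR
    return {
        "aw": ax + ATOP_BITS,
        "w": dw + dw // 8 + 1 + uw,          # data + strobe + last + user
        "b": iw + RESP_BITS + uw,
        "ar": ax,
        "r": iw + dw + RESP_BITS + 1 + uw,   # id + data + resp + last + user
    }


# Parameters taken from the configuration example above.
widths = axi_payload_bits(dw=64, aw=32, iw=3, uw=1)
widest = max(widths, key=widths.get)
print(widths, "widest channel:", widest)
```

Under these assumptions the W channel is the widest message, so it would determine the size of any link that carries it.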

Multiple physical links can be declared, and the mapping of the AXI channels to the physical links can be configured in util/*cfg.hjson. The following example shows the configuration for two physical channels, one for requests and one for responses. The mapping is done by listing the AXI channels that each physical link carries.

  channel_mapping: {
    req: {axi: ['aw', 'w', 'ar']},
    rsp: {axi: ['b', 'r']}
  },
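
A mapping like this can be sanity-checked before generating sources. The sketch below is a minimal, hypothetical helper (channel_to_link is not part of the repository) that inverts the mapping and assumes every one of the five AXI channels must be assigned to exactly one physical link. The real generator may enforce different rules.

```python
AXI_CHANNELS = {"aw", "w", "b", "ar", "r"}


def channel_to_link(channel_mapping, bus="axi"):
    """Invert {link: {bus: [channels]}} into {channel: link} and validate it."""
    result = {}
    for link, buses in channel_mapping.items():
        for ch in buses.get(bus, []):
            if ch not in AXI_CHANNELS:
                raise ValueError(f"unknown AXI channel '{ch}' on link '{link}'")
            if ch in result:
                raise ValueError(
                    f"channel '{ch}' mapped to both '{result[ch]}' and '{link}'")
            result[ch] = link
    missing = AXI_CHANNELS - set(result)
    if missing:
        raise ValueError(f"unmapped AXI channels: {sorted(missing)}")
    return result


# The mapping from the example above, written as a Python dict.
mapping = {
    "req": {"axi": ["aw", "w", "ar"]},
    "rsp": {"axi": ["b", "r"]},
}
links = channel_to_link(mapping)
print(links)
```

Separating requests (AW, W, AR) from responses (B, R) as in the example keeps the two directions of an AXI transaction on independent physical links.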

FlooNoC does not send any header or tail flits, in order to avoid serialization overhead. Instead, the additional information is sent in parallel with the payload and can be specified with the header argument, which lists each field and the number of bits it requires. For instance, the rob_req field specifies whether a response needs to be reordered. The rob_idx field specifies the index of the ROB that is used to track the outstanding requests. The dst_id & src_id fields specify the destination and source used to route the packet. The last field carries the last signal of a burst transfer, which is used in wormhole routing.

  header: {
    rob_req: 1,
    rob_idx: 6,
    dst_id: 6,
    src_id: 6,
    last: 1,
    atop: 1,
  }
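
To make the cost of this parallel header concrete, the following sketch sums the field widths and packs a set of field values into a single integer. The helper names are hypothetical, and the bit order (first field in the most significant bits, as in a SystemVerilog packed struct) is an assumption rather than something the configuration above specifies.

```python
# Header fields and widths copied from the configuration example above.
HEADER = {"rob_req": 1, "rob_idx": 6, "dst_id": 6, "src_id": 6, "last": 1, "atop": 1}


def header_bits(header):
    """Total number of header bits sent alongside every flit."""
    return sum(header.values())


def pack_header(header, values):
    """Pack field values into one integer, first field in the MSBs (assumed)."""
    word = 0
    for name, width in header.items():
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_header(header, word):
    """Inverse of pack_header: split the integer back into named fields."""
    values = {}
    for name, width in reversed(list(header.items())):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


fields = {"rob_req": 1, "rob_idx": 5, "dst_id": 12, "src_id": 3, "last": 1, "atop": 0}
packed = pack_header(HEADER, fields)
print(header_bits(HEADER), bin(packed), unpack_header(HEADER, packed))
```

With the widths shown above, the header adds 21 bits to every flit on top of the widest payload mapped to that link.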

Finally, the package source files can be generated with:

make sources
