ferrandi / PandA-bambu

PandA-bambu public repository

Wrong FSM in top.v when modifying FloPoCo operators

RaulMurillo opened this issue · comments

Hello! I'm experimenting with Bambu and FloPoCo.
I am trying to modify/update some FloPoCo operators within the ext/ folder (for more details, see my WIP fork https://github.com/RaulMurillo/PandA-bambu/tree/new_flopoco).

Of the many things I would need to change to make Bambu work, I am focusing only on changing the functionality of these operators (modifying the .cpp files inside ext/flopoco/src/ and re-compiling the whole project). Under this assumption, I was able to compile and install PandA-bambu.
However, one of the components in the top.v file generated by Bambu is wrong: the FSM generated for the module's controller is not updated to reflect the new operator.
Specifically, when I modify a FloPoCo component so that its number of pipeline cycles differs from the original design, the FSM in module controller_XX still contains a number of states that matches the pipeline depth of the previous FloPoCo operator, not the new one. As a consequence, running any operation on the generated module, such as --simulate, fails in most cases because done_port is asserted prematurely.
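
As a rough illustration of the mismatch described above, here is a toy Python model. It is only a sketch under strong assumptions: the controller is reduced to a linear FSM that spends one cycle per state waiting on a single operator, the operator is a plain shift-register pipeline, and its inputs are held stable. The function name `output_when_done` is hypothetical and nothing here is taken from the Bambu sources.

```python
def output_when_done(fsm_states, pipeline_depth, operand):
    """Return what the operator output holds when the FSM asserts done.

    Toy model: the FSM advances one state per clock cycle and asserts
    done after `fsm_states` cycles. The operator is a pipeline with
    `pipeline_depth` register stages (0 means combinational). None
    stands for "result not yet valid".
    """
    if pipeline_depth == 0:
        # Combinational operator: the result is available immediately.
        return operand
    stages = [None] * pipeline_depth
    for _ in range(fsm_states):
        # Inputs are held stable, so the operand enters stage 0 every cycle
        # while older values shift one stage toward the output.
        stages = [operand] + stages[:-1]
    return stages[-1]


# FSM sized for an old 3-cycle operator, new operator needs 5 cycles:
print(output_when_done(3, 5, 42))  # None: done fires before the result is valid
# FSM edited by hand to match the new depth:
print(output_when_done(5, 5, 42))  # 42
```

In this model, an FSM with too few states reads an invalid result, while an FSM with too many states only wastes cycles.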

If I manually modify this FSM to match the number of states required by the new operator, the testbenches pass.
Could you give me some pointers on how to update that part of the code? After searching the sources, I suspect that this FSM is related to src/HLS/architecture_creation/controller_creation/fsm_controller.cpp and the following functions, but I couldn't find where the pipeline depth of the module is read or passed to the FSM to generate the states and transitions.

DesignFlowStep_Status fsm_controller::InternalExec()

void fsm_controller::create_state_machine(std::string& parse)

I also noticed the following in the report shown after executing $ bambu module.c:

  State Transition Graph Information of function mm:
    Number of states: 4
    Minimum number of cycles: 4
    Maximum number of cycles 4
    Done port is registered
  Time to perform creation of STG: 0.01 seconds

If I modify the corresponding FloPoCo operator and make it, for example, a dummy identity operator (R <= X;), the report again shows 4 states for the STG, even though the operation is purely combinational.

I might be wrong, but I suspect that the number of states for that FSM is not derived from the operator itself; it seems to be set ad hoc for those FloPoCo components.

Hi Raul,
FloPoCo has been integrated by characterizing the used operators in advance. Eucalyptus, a tool that comes with Bambu, probes FloPoCo by running syntheses at different requested frequencies (we use the FloPoCo option -frequency). The maximum frequency obtained is then stored in an XML file, and we have one XML file for each target device.
For example, in this XML file please look for the fp_mult_expr_FU_32_32_32_200 component. The requested frequency is 200 MHz, but the obtained one is 1000/14.948 ≈ 66.9 MHz, and the operator is implemented as a combinational one. For 400 MHz, synthesis reports that the operator actually runs at 1000/6.784 ≈ 147.4 MHz, in 3 cycles and with an initiation interval (II) of 1.
Bambu then looks into the XML file and chooses the variant that best fits the specified clock period. So, if we want a design running at 50 MHz, Bambu chooses fp_mult_expr_FU_32_32_32_200, while if we target 100 MHz, Bambu chooses fp_mult_expr_FU_32_32_32_400. This has been done because it is challenging to control the timing of the FloPoCo components.
So, to integrate new operators, you need to change the XML by correcting the cycles, initiation_time, and stage_period fields.
Note that combinational components do not have the cycles, initiation_time, and stage_period fields, only the execution_time field. Once you have updated this file, you do not need to make other changes to the rest of the codebase.
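
The frequency arithmetic above, and the check of each variant against a target clock, can be sketched in a few lines of Python. This is only an illustration: the delays are the two values quoted in this reply, while the dictionary layout and the helper names (`achieved_mhz`, `clock_period_ns`, `variants_that_fit`) are hypothetical and do not mirror the XML schema or Bambu's code.

```python
# Delays in nanoseconds, as quoted above for the two multiplier variants.
VARIANTS = {
    "fp_mult_expr_FU_32_32_32_200": 14.948,  # combinational: execution_time
    "fp_mult_expr_FU_32_32_32_400": 6.784,   # pipelined: stage_period
}


def achieved_mhz(delay_ns):
    """Maximum clock frequency in MHz for a given delay in ns."""
    return 1000.0 / delay_ns


def clock_period_ns(freq_mhz):
    """Clock period in ns for a target frequency in MHz."""
    return 1000.0 / freq_mhz


def variants_that_fit(freq_mhz):
    """Names of the variants whose delay fits within the target clock period."""
    period = clock_period_ns(freq_mhz)
    return [name for name, delay in VARIANTS.items() if delay <= period]


for name, delay in VARIANTS.items():
    print(f"{name}: {achieved_mhz(delay):.1f} MHz")

for target in (50, 100):
    print(f"{target} MHz ({clock_period_ns(target):.0f} ns):", variants_that_fit(target))
```

At 50 MHz (20 ns) both variants fit, and the combinational one is the one Bambu is said to choose; at 100 MHz (10 ns) only the pipelined 400 MHz variant fits among these two.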

Concerning the FloPoCo code, I'm pretty curious about the changes you made to the library. Do they come from you or from the original library?
One of the issues with FloPoCo is its non-open-source license. The version we used was released under an open-source license. The license was later changed to something like an "all rights reserved" license.

Hi Fabrizio,

Thank you for the pointers. However, it is not clear to me which values should be set for cycles, initiation_time, and stage_period. Should I use Eucalyptus for that? I couldn't find any documentation on how to use it, apart from the --help option.

WRT the FloPoCo code, the library changes in the aforementioned branch are not mine, but from the official git repo https://gitlab.inria.fr/fdupont/flopoco. My purpose is not just to update FloPoCo within Bambu, but to provide an HLS flow for developing accelerators with new arithmetic formats (such as posit arithmetic, for which I developed certain operators in FloPoCo and included them in bambu here). I'm collaborating with Prof. Pilato on that.
Also, I am aware of FloPoCo's non-open-source license and its possible implications. However, the last I heard is that they are trying to change it to a GPL-like license for the next version.

Also, I realized that with that same example/device, the fp_mult_expr_FU_32_32_32_500 component has a higher stage_period (6.914) than the one for 400 MHz. This makes Bambu select the 500 MHz component when the requested frequency is 100 MHz, rather than the 400 MHz one, which should also satisfy the timing requirements while providing lower area/power consumption.
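
The selection behaviour described here can be reproduced with a small Python sketch. Both policies below are assumptions made for illustration: `pick_closest_delay` is only a rule inferred from the behaviour reported in this thread (the fitting variant whose delay is closest to the clock period), and `pick_lowest_request` is a hypothetical alternative that prefers the lowest requested frequency, on the unverified expectation that it costs less area and power. Neither is taken from the Bambu sources.

```python
from typing import List, NamedTuple, Optional


class Variant(NamedTuple):
    name: str
    requested_mhz: int
    delay_ns: float  # execution_time or stage_period, as quoted in the thread


VARIANTS = [
    Variant("fp_mult_expr_FU_32_32_32_200", 200, 14.948),
    Variant("fp_mult_expr_FU_32_32_32_400", 400, 6.784),
    Variant("fp_mult_expr_FU_32_32_32_500", 500, 6.914),
]


def fitting(target_mhz: float) -> List[Variant]:
    """Variants whose delay fits within the target clock period."""
    period_ns = 1000.0 / target_mhz
    return [v for v in VARIANTS if v.delay_ns <= period_ns]


def pick_closest_delay(target_mhz: float) -> Optional[Variant]:
    """Assumed policy: the fitting variant with the largest delay."""
    candidates = fitting(target_mhz)
    return max(candidates, key=lambda v: v.delay_ns) if candidates else None


def pick_lowest_request(target_mhz: float) -> Optional[Variant]:
    """Hypothetical alternative: the fitting variant with the lowest requested frequency."""
    candidates = fitting(target_mhz)
    return min(candidates, key=lambda v: v.requested_mhz) if candidates else None


print(pick_closest_delay(100).name)   # the 500 MHz variant, as observed
print(pick_lowest_request(100).name)  # the 400 MHz variant
```

Under these assumptions the two policies agree at 50 MHz (both return the 200 MHz variant) and differ only at 100 MHz, which is the case raised above.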

Anyway, I still could not figure out which method you used to get the stage_period for each component. Could you provide an example of the commands for such a synthesis with Eucalyptus?
Of course, the cycles parameter can be obtained directly from the VHDL file generated with FloPoCo under different frequency options, and I guess initiation_time should be 1 for the majority of situations.

OK, so after understanding how the Eucalyptus tool works, I was able to run, for example, the following command to synthesize the new operator:

eucalyptus --target-datafile=$PANDA_DIR/etc/devices/Xilinx_devices/xc7z020-1clg484-VVD-seed.xml --characterize=fp_plus_expr_FU-fp_plus_expr_FU_32_32_32_500

From those synthesis results, I can extract and replace the values mentioned in the XML file.

Thanks for the pointers!