First attempt to design and implement a matrix product algorithm in C for an FPGA (embedded system), using Xilinx Vivado HLS.
- Xilinx Vivado HLS WebPACK (available for free on the Xilinx website) or a paid Vivado edition.
- source - C code and headers (.h).
- test - Test C code.
- pictures_statistics - pictures of the simulation result statistics.
A classic implementation of matrix multiplication.
- A[M][L] * B[L][N] = R[M][N]
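
As a rough illustration of the formula above, a classic triple-loop product could look like the sketch below. It is not taken from the `source` folder: the function name `matrix_product`, the dimension macros `DIM_M`, `DIM_L`, `DIM_N` and their values, the `int` element type, and the inner loop labels `lazoColumnas` and `lazoProducto` are all assumptions made for this example. Only the label `lazoFilas` is named in this README.

```c
/* Hypothetical dimensions, chosen only for illustration. */
#define DIM_M 2
#define DIM_L 3
#define DIM_N 2

/* R = A * B, with A of size MxL, B of size LxN and R of size MxN. */
void matrix_product(int A[DIM_M][DIM_L], int B[DIM_L][DIM_N], int R[DIM_M][DIM_N])
{
    /* Loop labels let HLS directives refer to each loop by name. */
    lazoFilas: for (int i = 0; i < DIM_M; i++) {            /* rows of A */
        lazoColumnas: for (int j = 0; j < DIM_N; j++) {     /* columns of B */
            int acc = 0;
            lazoProducto: for (int k = 0; k < DIM_L; k++) { /* dot product */
                acc += A[i][k] * B[k][j];
            }
            R[i][j] = acc;
        }
    }
}
```

For example, with A = {{1, 2, 3}, {4, 5, 6}} and B = {{7, 8}, {9, 10}, {11, 12}}, the result is R = {{58, 64}, {139, 154}}.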
One of the most common directives in RTL synthesis, and in high-level synthesis in general, is pipeline. More information about directives is available in the Xilinx documentation.
To show an example, below you can see the latency without using any directive:
The latency is then 1621 [clock cycles].
Xilinx Doc.: Latency is defined as the number of clock cycles required to produce an output.
Now, specify a pipeline directive for the loop 'lazoFilas' (image below).
The latency improves compared with the first result, dropping to 155 [clock cycles] (image below).
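
The directive can also be written directly in the C source as a pragma instead of being set through the tool's directives view. The sketch below assumes that form; the function name `matrix_product_pipelined`, the dimension macros and their values, and the inner loop labels are hypothetical and only serve the example. A regular C compiler ignores the `#pragma HLS` line, so the same code still runs in a C test bench.

```c
/* Hypothetical dimensions, chosen only for illustration. */
#define DIM_M 2
#define DIM_L 3
#define DIM_N 2

/* Same triple loop as a plain matrix product, with the outer loop pipelined. */
void matrix_product_pipelined(int A[DIM_M][DIM_L], int B[DIM_L][DIM_N],
                              int R[DIM_M][DIM_N])
{
    lazoFilas: for (int i = 0; i < DIM_M; i++) {
/* Pipeline directive applied to the loop labelled lazoFilas. */
#pragma HLS PIPELINE
        lazoColumnas: for (int j = 0; j < DIM_N; j++) {
            int acc = 0;
            lazoProducto: for (int k = 0; k < DIM_L; k++) {
                acc += A[i][k] * B[k][j];
            }
            R[i][j] = acc;
        }
    }
}
```

When a loop is pipelined, Vivado HLS unrolls the loops nested inside it, so the reduced latency usually comes at the cost of more hardware resources.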
latency_NoPipeline = 1621 [clock cycles]
latency_Pipeline = 155 [clock cycles]
change = 1621/155 = 10.458 times smaller
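
The comparison above can be reproduced with a small helper, shown here as a sketch. The function names `latency_ratio` and `latency_ns` are made up for this example, and the clock period passed to `latency_ns` is an assumption, since this README does not state the clock used for synthesis.

```c
/* How many times smaller the latency became (both values in clock cycles). */
double latency_ratio(unsigned int cycles_before, unsigned int cycles_after)
{
    return (double)cycles_before / (double)cycles_after;
}

/* Latency in nanoseconds for a given (hypothetical) clock period. */
double latency_ns(unsigned int cycles, double clock_period_ns)
{
    return cycles * clock_period_ns;
}
```

`latency_ratio(1621, 155)` gives about 10.458. With a hypothetical 10 ns clock, `latency_ns(1621, 10.0)` gives 16210 ns and `latency_ns(155, 10.0)` gives 1550 ns.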
So it is worth getting to know how to use these directives in order to meet design requirements.