fbv81bp / CPP_Matrix

Matrix computation performance benchmarks.

Matrix benchmarks in C/Cpp

I coded 4 benchmarks for computing matrix multiplication in C/Cpp:

  • A trivial version, where the indexes follow one another in the same order as in a math book's definition.
  • A cache-efficient version, where reordering the nested FOR loops makes the first index of the 2D B array change more slowly, which results in fewer cache misses.
  • A DSP-oriented version, which lets the code be mapped onto a multiply-accumulate unit, but sacrifices cache efficiency to do so: it has to multiply a row of A by a column of B, yet both matrices are stored row-wise in memory.
  • A transposed version, which uses a transposed copy of the B matrix, so that the columns of B are stored contiguously in memory. It can therefore apply both previous improvements at the same time.
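
The four variants above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the size `N`, the `Matrix` alias and the function names `mul_trivial`, `mul_cache`, `mul_mac`, `mul_transposed` and `transpose` are all assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical size and type; the repository's own names are not shown here.
constexpr std::size_t N = 4;
using Matrix = std::array<std::array<double, N>, N>;

// 1. Trivial: i, j, k order, written straight from the textbook formula.
void mul_trivial(const Matrix& A, const Matrix& B, Matrix& C) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j) {
            C[i][j] = 0.0;
            for (std::size_t k = 0; k < N; ++k)
                C[i][j] += A[i][k] * B[k][j];
        }
}

// 2. Cache-efficient: i, k, j order. The first index of B (k) now changes
//    more slowly than the second (j), so B and C are both walked row-wise.
void mul_cache(const Matrix& A, const Matrix& B, Matrix& C) {
    assert(&C != &A && &C != &B);  // C is zeroed first, so it must not alias an input
    for (auto& row : C) row.fill(0.0);
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t k = 0; k < N; ++k) {
            const double a = A[i][k];
            for (std::size_t j = 0; j < N; ++j)
                C[i][j] += a * B[k][j];
        }
}

// 3. DSP-oriented: one local accumulator per output element, the shape a
//    multiply-accumulate unit expects. B is still read column-wise here.
void mul_mac(const Matrix& A, const Matrix& B, Matrix& C) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j) {
            double acc = 0.0;
            for (std::size_t k = 0; k < N; ++k)
                acc += A[i][k] * B[k][j];
            C[i][j] = acc;
        }
}

// Returns the transpose of B, so that columns of B become contiguous rows.
Matrix transpose(const Matrix& B) {
    Matrix Bt{};
    for (std::size_t r = 0; r < N; ++r)
        for (std::size_t c = 0; c < N; ++c)
            Bt[c][r] = B[r][c];
    return Bt;
}

// 4. Transposed: accumulator as in (3), but both operands are read row-wise.
void mul_transposed(const Matrix& A, const Matrix& Bt, Matrix& C) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j) {
            double acc = 0.0;
            for (std::size_t k = 0; k < N; ++k)
                acc += A[i][k] * Bt[j][k];
            C[i][j] = acc;
        }
}
```

All four functions compute the same product; only the memory access pattern and the use of a local accumulator differ. The fourth variant is called as `mul_transposed(A, transpose(B), C)`, so the cost of the transposition is paid once, outside the inner loops.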

The plain version, which does not use the heap, is easier to follow, while the heap-based version is better suited to telling the actual runtimes of the variants apart.
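
A heap-based benchmark might look like the sketch below. It rests on an assumption the text does not spell out: that heap allocation matters because it permits matrices much larger than the stack can hold, which makes the runtimes long enough to compare. `HeapMatrix`, `multiply` and `time_ms` are hypothetical names for this example, and the kernel shown is the cache-efficient loop order.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: an n x n matrix stored row-wise in one heap buffer.
struct HeapMatrix {
    std::size_t n;
    std::vector<double> data;
    explicit HeapMatrix(std::size_t size) : n(size), data(size * size, 0.0) {}
    double& at(std::size_t r, std::size_t c) { return data[r * n + c]; }
    double at(std::size_t r, std::size_t c) const { return data[r * n + c]; }
};

// Cache-efficient i, k, j multiplication on heap-allocated matrices.
void multiply(const HeapMatrix& A, const HeapMatrix& B, HeapMatrix& C) {
    assert(A.n == B.n && B.n == C.n);  // square matrices of equal size only
    const std::size_t n = A.n;
    std::fill(C.data.begin(), C.data.end(), 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k) {
            const double a = A.at(i, k);
            for (std::size_t j = 0; j < n; ++j)
                C.at(i, j) += a * B.at(k, j);
        }
}

// Runs f once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A call such as `double ms = time_ms([&] { multiply(A, B, C); });` with, say, `HeapMatrix A(1024), B(1024), C(1024);` would time one multiplication. The size 1024 is only an illustrative choice, and a careful benchmark would repeat the measurement several times.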


Languages

Language:C++ 100.0%