update vkpeak 20230812

nihui opened this issue a year ago

https://github.com/nihui/vkpeak/releases/tag/20230812

An fp16-matrix value has been added for all devices that support VK_KHR_cooperative_matrix, such as RTX 20-series and newer, and RDNA3.
It reflects the computing power of the Tensor Cores or a similar AI engine on the device.

At the moment, all NVIDIA Turing and newer devices are known to work.
RDNA3 devices work with the latest Windows driver (over 130 TFLOPS measured on my 7900 XTX).

In the future, the Linux Mesa driver is expected to follow and bring this extension to Intel and other GPUs.

Sample output on an NVIDIA T4:

[action@VM-116-181-centos build]$ ./vkpeak 0
device       = GRID T4-8C

fp32-scalar  = 3823.95 GFLOPS
fp32-vec4    = 3796.63 GFLOPS

fp16-scalar  = 3599.11 GFLOPS
fp16-vec4    = 7203.46 GFLOPS
fp16-matrix  = 29188.25 GFLOPS

fp64-scalar  = 127.15 GFLOPS
fp64-vec4    = 127.13 GFLOPS

int32-scalar = 3667.11 GIOPS
int32-vec4   = 3741.25 GIOPS

int16-scalar = 3707.29 GIOPS
int16-vec4   = 3797.13 GIOPS