renzibei / optimize-gemm

How to optimize SGEMM on a single-threaded ARM CPU, a multi-threaded ARM CPU, and an NVIDIA GPU

Optimize Gemm

This repo shows how to optimize SGEMM (single-precision general matrix multiplication) on a single-threaded ARM CPU, a multi-threaded ARM CPU, and an NVIDIA GPU.
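For orientation, here is a minimal sketch of what SGEMM computes, written as plain C. It is not taken from this repo: the function names `sgemm_naive` and `sgemm_ikj`, the row-major layout, and the `alpha`/`beta` scaling convention are assumptions chosen for illustration. The second function shows one simple first step, reordering the loops so the inner loop reads memory contiguously.

```c
#include <assert.h>
#include <stddef.h>

/* Reference SGEMM: C = alpha * A * B + beta * C.
 * Assumed layout (not confirmed by the repo): row-major,
 * A is M x K, B is K x N, C is M x N. */
void sgemm_naive(size_t M, size_t N, size_t K, float alpha,
                 const float *A, const float *B, float beta, float *C)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i = 0; i < M; ++i) {
        for (size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            /* Dot product of row i of A with column j of B.
             * B is read with stride N here, which is cache-unfriendly. */
            for (size_t k = 0; k < K; ++k)
                acc += A[i * K + k] * B[k * N + j];
            C[i * N + j] = alpha * acc + beta * C[i * N + j];
        }
    }
}

/* Same computation with i-k-j loop order: the inner loop now walks
 * a row of B and a row of C contiguously. */
void sgemm_ikj(size_t M, size_t N, size_t K, float alpha,
               const float *A, const float *B, float beta, float *C)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i = 0; i < M; ++i) {
        /* Apply beta to row i of C once, before accumulating. */
        for (size_t j = 0; j < N; ++j)
            C[i * N + j] *= beta;
        for (size_t k = 0; k < K; ++k) {
            const float a = alpha * A[i * K + k];
            for (size_t j = 0; j < N; ++j)
                C[i * N + j] += a * B[k * N + j];
        }
    }
}
```

A naive version like this is mainly useful as a correctness reference when checking faster kernels. The two functions can differ in the last bits of the result because they round in a different order.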

In each subdirectory, run make to compile the program. The build produces a benchmark executable that tests the GEMM implementation. See the makefiles for details.
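GEMM benchmarks commonly report throughput in GFLOP/s, counting 2·M·N·K floating-point operations per multiplication. The sketch below shows that arithmetic; it is an assumption about how such a benchmark can be scored, not a description of what this repo's benchmark program actually prints. The helper names `now_seconds` and `sgemm_gflops` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Wall-clock time in seconds, using C11 timespec_get.
 * Note: TIME_UTC is not a monotonic clock; this is good enough
 * for a sketch but can jump if the system clock is adjusted. */
double now_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Throughput of one M x N x K matrix multiplication that took
 * `seconds`: each of the M*N outputs needs K multiplies and K adds. */
double sgemm_gflops(size_t M, size_t N, size_t K, double seconds)
{
    assert(seconds > 0.0);
    return 2.0 * (double)M * (double)N * (double)K / seconds / 1e9;
}
```

In use, one would call `now_seconds()` before and after the kernel under test, ideally averaging over several runs, and pass the elapsed time to `sgemm_gflops`.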


Languages

C 87.3%, Assembly 5.8%, Python 5.1%, Makefile 1.7%