DongDongBan / gemm-pybind-learning

This repo demonstrates how to use pybind11 to provide a Python interface for highly optimized CUDA code. It contains only basic functionality.

gemm-pybind-learning

This repo demonstrates how to use pybind11 to provide a Python interface for highly optimized CUDA code. It contains only basic functionality. The present work uses modern CMake/CUDA and Yujia Zhai's GEMV implementation approach. The CMakeLists.txt comes from pkestene.

Build and Install

This project requires CMake >= 3.18. It can be built with the commands below:

git clone --recurse-submodules https://github.com/dongdongban/gemm-pybind-learning
cd gemm-pybind-learning
cmake -S . -B build -DCMAKE_CUDA_ARCHITECTURES="75" && cd build
cmake --build .

The value "75" (that is, sm_75) should be replaced by the compute capability of your own GPU, written without the dot (for example, compute capability 8.6 becomes "86").
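One way to look up the value for CMAKE_CUDA_ARCHITECTURES is to ask the driver. The sketch below is not part of this repo. It assumes nvidia-smi is on PATH and that your driver supports the compute_cap query field (older drivers may not, in which case nothing is detected). The helper names are illustrative.

```python
import re
import shutil
import subprocess


def to_cmake_arch(compute_cap):
    """Convert a compute capability such as '7.5' to the CMake form '75'."""
    major, minor = compute_cap.strip().split(".")
    return f"{int(major)}{int(minor)}"


def detect_compute_caps():
    """Return the compute capabilities reported by nvidia-smi, or [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return []
    # Keep only lines that look like 'major.minor'; ignore anything else.
    return [ln.strip() for ln in result.stdout.splitlines()
            if re.fullmatch(r"\d+\.\d+", ln.strip())]


if __name__ == "__main__":
    caps = detect_compute_caps()
    if caps:
        archs = ";".join(sorted({to_cmake_arch(c) for c in caps}))
        print(f'-DCMAKE_CUDA_ARCHITECTURES="{archs}"')
    else:
        print("Could not detect a GPU; look up your card's compute capability manually.")
```

If a GPU is detected, the script prints a flag that can be pasted into the cmake configure command above.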

Verification

If nothing went wrong, check your module with these commands:

cd Optimizing-SGEMV-on-NVIDIA-GPUs
python -c 'import mygemm; mygemm.host(4096, 4096, 1); mygemm.host(4096, 4096, 2)'
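The same check can be written as a small script that reports a missing module instead of failing with a traceback. This is a minimal sketch: it reuses only the mygemm.host calls shown above, and because the meaning of the third argument is not documented here, the values 1 and 2 are passed through unchanged. The helper names are illustrative.

```python
import importlib
import importlib.util

MODULE_NAME = "mygemm"  # module name taken from the command above


def module_available(name):
    """Return True if `name` can be found on the current import path."""
    return importlib.util.find_spec(name) is not None


def run_checks():
    """Import the module and repeat the two calls from the one-liner."""
    if not module_available(MODULE_NAME):
        print(f"{MODULE_NAME} not found; run this from the directory "
              "that contains the built module.")
        return False
    mygemm = importlib.import_module(MODULE_NAME)
    # Third argument taken verbatim from the README command (1, then 2).
    for last_arg in (1, 2):
        mygemm.host(4096, 4096, last_arg)
    return True


if __name__ == "__main__":
    run_checks()
```

Run it from the same directory as the one-liner so that Python can find the compiled extension.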

License:GNU General Public License v3.0

