This is a series of GPU optimization topics that introduces, in detail, how to optimize programs on the GPU. The reduce optimization is complete. For GEMM, the CUDA C code is complete; it is currently being tuned with an assembler, and the tuned code will be released later.
Apache License 2.0