KyoohyungHan / Cuda_practice

Repository from Github https://github.com/KyoohyungHan/Cuda_practice

Cuda_practice

Note

  1. The first cudaMalloc call is very slow, so the implementation issues a dummy cudaMalloc up front to absorb that one-time cost.
  2. cudaMemcpy from device to host is much slower than from host to device (roughly 10 times slower).
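Both notes can be checked with a small timing program. The following is a minimal sketch, not code from this repository: the helper names `timeMallocMs` and `timeCopyMs`, the `CHECK` macro, and the 64 MiB buffer size are all assumptions made for illustration. It times two consecutive cudaMalloc calls (the first one doubles as the dummy warm-up call) and then one copy in each direction using pageable host memory. The actual numbers depend on the GPU, driver, and host memory type, so no expected output is shown.

```cuda
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            std::fprintf(stderr, "CUDA error: %s\n",               \
                         cudaGetErrorString(err_));                \
            std::exit(EXIT_FAILURE);                               \
        }                                                          \
    } while (0)

// Wall-clock time of a single cudaMalloc call, in milliseconds.
// The matching cudaFree is done outside the timed region.
static double timeMallocMs(size_t bytes) {
    void* p = nullptr;
    auto t0 = std::chrono::steady_clock::now();
    CHECK(cudaMalloc(&p, bytes));
    auto t1 = std::chrono::steady_clock::now();
    CHECK(cudaFree(p));
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Time of one blocking cudaMemcpy, measured with CUDA events, in milliseconds.
static float timeCopyMs(void* dst, const void* src, size_t bytes,
                        cudaMemcpyKind kind) {
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));
    CHECK(cudaEventRecord(start));
    CHECK(cudaMemcpy(dst, src, bytes, kind));
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));
    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    return ms;
}

int main() {
    const size_t bytes = 64u << 20;  // 64 MiB, an arbitrary test size

    // Note 1: the first call acts as the dummy (warm-up) cudaMalloc.
    std::printf("first cudaMalloc : %.3f ms\n", timeMallocMs(bytes));
    std::printf("second cudaMalloc: %.3f ms\n", timeMallocMs(bytes));

    // Note 2: compare the two copy directions with pageable host memory.
    void* host = std::malloc(bytes);
    if (host == nullptr) return EXIT_FAILURE;
    std::memset(host, 0, bytes);  // touch the pages before timing
    void* dev = nullptr;
    CHECK(cudaMalloc(&dev, bytes));

    std::printf("host -> device: %.3f ms\n",
                timeCopyMs(dev, host, bytes, cudaMemcpyHostToDevice));
    std::printf("device -> host: %.3f ms\n",
                timeCopyMs(host, dev, bytes, cudaMemcpyDeviceToHost));

    CHECK(cudaFree(dev));
    std::free(host);
    return 0;
}
```

Assuming the file is saved as `timing.cu`, it can be built with `nvcc timing.cu -o timing`. Running each copy several times and averaging would give steadier numbers than the single measurement taken here.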

Languages

Language: Cuda 100.0%