KyoohyungHan / Cuda_practice

Repository from GitHub: https://github.com/KyoohyungHan/Cuda_practice

Cuda_practice

Note

The first cudaMalloc call is very slow, so the implementation issues a dummy cudaMalloc up front to absorb that one-time cost.
cudaMemcpy from device to host is much slower than from host to device (roughly a 10x difference).
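
Both notes can be checked with a small timing program. The following is a minimal sketch, not the repository's own code: the helper names (time_malloc_ms, time_memcpy_ms), the 64 MiB buffer size, and the use of pageable host memory are assumptions made for illustration. The slow first call is typically explained by the CUDA runtime creating its context lazily on the first API call; the measured ratios will vary with GPU, driver, and host memory type.

```cuda
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical helper: wall-clock time of a single cudaMalloc call, in ms.
// The buffer is freed afterwards; the free is not included in the timing.
static double time_malloc_ms(size_t bytes) {
    void* p = nullptr;
    auto t0 = std::chrono::steady_clock::now();
    CUDA_CHECK(cudaMalloc(&p, bytes));
    auto t1 = std::chrono::steady_clock::now();
    CUDA_CHECK(cudaFree(p));
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Hypothetical helper: time of one blocking cudaMemcpy, measured with CUDA events.
static float time_memcpy_ms(void* dst, const void* src, size_t bytes,
                            cudaMemcpyKind kind) {
    cudaEvent_t start, stop;
    CUDA_CHECK(cudaEventCreate(&start));
    CUDA_CHECK(cudaEventCreate(&stop));
    CUDA_CHECK(cudaEventRecord(start));
    CUDA_CHECK(cudaMemcpy(dst, src, bytes, kind));
    CUDA_CHECK(cudaEventRecord(stop));
    CUDA_CHECK(cudaEventSynchronize(stop));
    float ms = 0.0f;
    CUDA_CHECK(cudaEventElapsedTime(&ms, start, stop));
    CUDA_CHECK(cudaEventDestroy(start));
    CUDA_CHECK(cudaEventDestroy(stop));
    return ms;
}

int main() {
    // Dummy allocation: the first call also pays for runtime initialization.
    double first_ms = time_malloc_ms(1);
    double second_ms = time_malloc_ms(1);
    std::printf("first cudaMalloc:  %.3f ms\n", first_ms);
    std::printf("second cudaMalloc: %.3f ms\n", second_ms);

    const size_t bytes = 64u << 20;       // 64 MiB, an arbitrary test size
    std::vector<char> host(bytes, 1);     // pageable host memory (assumption)
    void* dev = nullptr;
    CUDA_CHECK(cudaMalloc(&dev, bytes));

    float h2d_ms = time_memcpy_ms(dev, host.data(), bytes, cudaMemcpyHostToDevice);
    float d2h_ms = time_memcpy_ms(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    std::printf("host to device: %.3f ms\n", h2d_ms);
    std::printf("device to host: %.3f ms\n", d2h_ms);

    CUDA_CHECK(cudaFree(dev));
    return 0;
}
```

Compile with nvcc (for example, nvcc timing.cu -o timing, where the file name is arbitrary). To see whether the copy asymmetry comes from pageable host memory, allocate the host buffer with cudaMallocHost instead of std::vector and compare the two directions again.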

Languages

CUDA: 100.0%