debowin / cuda-tiled-2D-convolution

An optimized, parallel, tiled approach to 2D convolution that takes advantage of lower-latency, higher-bandwidth shared memory within each GPU thread block, as well as constant memory, which is cached aggressively on the GPU.
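The description above can be illustrated with a minimal sketch of the general tiling technique. It is not taken from this repository: the names `tiledConv2D`, `runTiledConv2D`, and `d_mask`, the 5x5 mask size, the 16x16 tile size, and the zero-padding at image borders are all assumptions chosen for illustration. Each block loads one input tile plus its halo into shared memory, and every thread reads the mask from constant memory.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical sizes for illustration; not taken from the repository.
#define MASK_WIDTH  5
#define MASK_RADIUS (MASK_WIDTH / 2)
#define TILE_WIDTH  16                              // output pixels per block side
#define BLOCK_WIDTH (TILE_WIDTH + MASK_WIDTH - 1)   // threads per block side, halo included

// The mask is small and read-only, so it can live in constant memory,
// where reads are served by the constant cache.
__constant__ float d_mask[MASK_WIDTH * MASK_WIDTH];

__global__ void tiledConv2D(const float* in, float* out, int width, int height)
{
    // Input tile plus halo, shared by all threads of the block.
    __shared__ float tile[BLOCK_WIDTH][BLOCK_WIDTH];

    int tx = threadIdx.x, ty = threadIdx.y;
    int colOut = blockIdx.x * TILE_WIDTH + tx;   // output pixel this thread may write
    int rowOut = blockIdx.y * TILE_WIDTH + ty;
    int colIn  = colOut - MASK_RADIUS;           // input pixel this thread loads
    int rowIn  = rowOut - MASK_RADIUS;

    // Every thread loads one element; pixels outside the image are treated as zero (assumption).
    bool inside = rowIn >= 0 && rowIn < height && colIn >= 0 && colIn < width;
    tile[ty][tx] = inside ? in[rowIn * width + colIn] : 0.0f;
    __syncthreads();

    // Only the first TILE_WIDTH x TILE_WIDTH threads produce an output pixel.
    if (tx < TILE_WIDTH && ty < TILE_WIDTH && rowOut < height && colOut < width) {
        float acc = 0.0f;
        for (int i = 0; i < MASK_WIDTH; ++i)
            for (int j = 0; j < MASK_WIDTH; ++j)
                acc += d_mask[i * MASK_WIDTH + j] * tile[ty + i][tx + j];  // mask is not flipped
        out[rowOut * width + colOut] = acc;
    }
}

// Hypothetical host helper; CUDA error checking is omitted for brevity.
std::vector<float> runTiledConv2D(const std::vector<float>& in, int width, int height,
                                  const std::vector<float>& mask)
{
    size_t bytes = in.size() * sizeof(float);
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, in.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpyToSymbol(d_mask, mask.data(), MASK_WIDTH * MASK_WIDTH * sizeof(float));

    dim3 block(BLOCK_WIDTH, BLOCK_WIDTH);
    dim3 grid((width + TILE_WIDTH - 1) / TILE_WIDTH, (height + TILE_WIDTH - 1) / TILE_WIDTH);
    tiledConv2D<<<grid, block>>>(d_in, d_out, width, height);

    std::vector<float> out(in.size());
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);  // waits for the kernel to finish
    cudaFree(d_in);
    cudaFree(d_out);
    return out;
}
```

In this sketch each input pixel is fetched from global memory once per block and then reused from shared memory by up to 25 neighbouring outputs, which is where the tiling is expected to save bandwidth. The trade-off of this particular layout is that the halo threads only load data and sit idle during the accumulation loop.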
