CUDA MNIST denoising

MNIST denoising auto-encoder on CUDA from scratch

Low (28x28) and high (256x256) resolution examples:

Requirements

Tested with CUDA 11.6, NVCC 11.6, Ubuntu 20.04

cd project_folder
nvcc -O2 kernel.cu main.cpp -o denoiser

./denoiser {input_img_path} [optional -benchmark N]
If N > 0, the forward pass is run N times and the GPU time is measured.
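The command line above could be parsed along the following lines. This is only a sketch and not the code in main.cpp: the `Options` struct and the `parse_args` function are hypothetical names, and treating any unknown flag as an error is an assumption.

```cpp
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical container for the parsed command line.
struct Options {
    std::string image_path;  // required positional argument
    int benchmark_runs = 0;  // 0 means no benchmark was requested
};

// Parses: denoiser {input_img_path} [-benchmark N]
// Returns false when the image path is missing, N is absent or not a
// non-negative integer, or an unknown argument is present.
bool parse_args(int argc, const char* const argv[], Options& opts) {
    if (argc < 2) return false;
    opts.image_path = argv[1];
    opts.benchmark_runs = 0;
    for (int i = 2; i < argc; ++i) {
        if (std::string(argv[i]) != "-benchmark") return false;
        if (i + 1 >= argc) return false;  // flag given without N
        char* end = nullptr;
        const long n = std::strtol(argv[i + 1], &end, 10);
        // Reject empty input, trailing garbage, and out-of-range values.
        if (end == argv[i + 1] || *end != '\0' || n < 0 || n > INT_MAX) {
            return false;
        }
        opts.benchmark_runs = static_cast<int>(n);
        ++i;  // skip the value that was just consumed
    }
    return true;
}
```

With this sketch, `./denoiser img.png -benchmark 1000` would yield `benchmark_runs == 1000`, while omitting the flag leaves it at 0.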

Training in Model training.ipynb follows https://keras.io/examples/vision/autoencoder/, with an added step that exports the weights to a binary file.
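On the C++ side, an exported weights file could be read back roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual loader: it assumes the file holds consecutive 4-byte floats in host byte order with no header, and `load_weights` is a hypothetical helper name. The real layout is whatever the notebook writes.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Reads `count` float32 values from a raw binary file.
// Assumed layout: consecutive 4-byte floats, host byte order, no header.
std::vector<float> load_weights(const std::string& path, std::size_t count) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::vector<float> weights(count);
    const std::size_t bytes = count * sizeof(float);
    in.read(reinterpret_cast<char*>(weights.data()),
            static_cast<std::streamsize>(bytes));
    // gcount() reports how many bytes were actually read.
    if (static_cast<std::size_t>(in.gcount()) != bytes) {
        throw std::runtime_error("weights file shorter than expected: " + path);
    }
    return weights;
}
```

The caller has to know how many values each layer needs. For example, a 3x3 convolution with 1 input channel and 32 filters has 3*3*1*32 = 288 kernel weights plus 32 biases, 320 values in total. The loaded buffer can then be copied to the device with `cudaMemcpy`.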

Img shape: 28x28x1
Num runs: 1000
Total GPU time: 6034.27 ms
AVG one forward pass GPU time: 6.03427 ms

Img shape: 28x28x1
Num runs: 1000
Total GPU time: 5998.17 ms
AVG one forward pass GPU time: 5.99817 ms

Img shape: 28x28x1
Num runs: 1000
Total GPU time: 3673.7 ms
AVG one forward pass GPU time: 3.6737 ms


MIT License
