tuteng0915 / cuda-floyd_warshall

CUDA implementation of the Blocked Floyd Warshall All pairs shortest path graph algorithm

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Cuda Floyd Warshall implementation

CUDA implementation of the Blocked Floyd-Warshall All pairs shortest path graph algorithm based on article: "A Multi-Stage CUDA Kernel for Floyd-Warshall" (Ben Lund, Justin W. Smith)

Tested:

Hardware (MSI GP72 7RE):
  1. Processor: Intel(R) Core(TM) i7-7700HQ
  2. GPU: GTX 1050 Ti 2GB RAM
  3. RAM: 8GB RAM
Environment:
  1. System: Ubuntu 17.10
  2. NVCC: 9.1
  3. CC: 6.1

Performance results:

|V||E|fw.cpp fw-cuda.cu Speedup blocked-fw-cuda.cu Speedup
1000 150000 0.724s 0.176s 4.215x 0.117s 6.188x
2000 400000 5.895s 0.642s 9.182x 0.213s 27.67x
5000 500000 84.91s 8.406s 10.10x 2.301s 36.90x
7500 1125000 279.0s 28.31s 9.855x 7.550s 36.95x
10000 2000000 665.9s 63.81s 10.43x 17.28s 38.53x
12500 3125000 1269s 125.5s 10.11x 34.03s 37.27x

Compile source:

  1. Install make/nvcc
  2. Update makefile with path to nvcc compiler
  3. Run command: make all

Run tests:

  1. Install Python 3.6
  2. Go to project directory
  3. Run command: python3 -m unittest discover -s test -v

Author: Mateusz Bojanowski

About

CUDA implementation of the Blocked Floyd Warshall All pairs shortest path graph algorithm

License:MIT License


Languages

Language:Cuda 44.3%Language:Python 29.3%Language:C++ 21.1%Language:Makefile 5.3%