Jesun Sahariar Firoz's repositories
ACSpGEMM
Repository holding the code base for AC-SpGEMM: "Adaptive Sparse Matrix-Matrix Multiplication on the GPU"
Elemental
Distributed-memory, arbitrary-precision, dense and sparse-direct linear algebra, conic optimization, and lattice reduction
FBLAS
BLAS implementation for Intel FPGA
FlowGNN
A dataflow architecture for universal graph neural network inference via multi-queue streaming.
fucking-algorithm
Cracking algorithm problems is all about patterns; labuladong is all you need! English version supported! Crack LeetCode, not only how, but also why.
GraphBLAS
Materials for a GraphBLAS tutorial
graphblast
High-Performance Linear Algebra-based Graph Primitives on GPUs
GSWITCH
A pattern-based algorithmic auto-tuner for graph processing on GPUs
GSWITCH-1
A pattern-based algorithmic autotuner for graph processing on GPUs.
gunrock
High-Performance Graph Primitives on GPUs
hornet
Hornet data structure for sparse dynamic graphs and matrices
interviews
Everything you need to know to get the job.
moderngpu
Patterns and behaviors for GPU computing
nccl
Optimized primitives for collective multi-GPU communication
osu-micro-benchmarks-5.3.2
ROCm - UCX enabled OSU_Benchmarks
ppopp19-artifact
Artifact evaluation package for PPoPP 2019
push-pull
Code for paper "Implementing Push-Pull Efficiently in GraphBLAS" accepted to ICPP 2018
S-BLAS
Implementations of sparse matrix-vector multiplication (SpMV) and sparse matrix-matrix multiplication (SpMM) for single-node multi-GPU (scale-up) platforms such as NVIDIA DGX-1 and DGX-2.
sep-graph
This is the repo of "SEP-Graph: Finding Shortest Execution Paths for Graph Processing under a Hybrid Framework on GPU"
SHAD
Scalable High-performance Algorithms and Data-structures
SICM
Simplified Interface to Complex Memory
spECK
Efficient SpGEMM on GPU using CUDA and CSR
Tartan
Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite
ucx
Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)