jesunsahariar

Pacific Northwest National Laboratory

Seattle, WA

Jesun Sahariar Firoz's repositories

ACSpGEMM

Repository holding the code base for AC-SpGEMM: "Adaptive Sparse Matrix-Matrix Multiplication on the GPU"

Language: Cuda | License: MIT | 000

bMatching

Language: C++ | License: BSD-3-Clause | 010

ColPack

A Graph Coloring Algorithm Package

Language: C++ | License: BSD-3-Clause | 010

Elemental

Distributed-memory, arbitrary-precision, dense and sparse-direct linear algebra, conic optimization, and lattice reduction

Language: C++ | License: NOASSERTION | 010

FBLAS

BLAS implementation for Intel FPGA

Language: C++ | License: BSD-3-Clause | 000

FlowGNN

A dataflow architecture for universal graph neural network inference via multi-queue streaming.

Language: C++ | 000

fucking-algorithm

Grinding algorithm problems is all about patterns; labuladong is all you need! English version supported! Crack LeetCode, not only how, but also why.

000

gapbs

GAP Benchmark Suite

Language: C++ | License: NOASSERTION | 010

GraphBLAS

Materials for a GraphBLAS tutorial

Language: C | License: NOASSERTION | 000

graphblast

High-Performance Linear Algebra-based Graph Primitives on GPUs

Language: C++ | License: Apache-2.0 | 000

GSWITCH

A pattern-based algorithmic auto-tuner for graph processing on GPUs

Language: Cuda | 000

GSWITCH-1

A pattern-based algorithmic autotuner for graph processing on GPUs.

Language: Cuda | 000

gunrock

High-Performance Graph Primitives on GPUs

Language: Cuda | License: Apache-2.0 | 010

hornet

Hornet data structure for sparse dynamic graphs and matrices

Language: Cuda | License: BSD-3-Clause | 000

interviews

Everything you need to know to get the job.

Language: Java | License: MIT | 000

Lux

A Distributed Multi-GPU System for Fast Graph Processing

Language: Cuda | License: Apache-2.0 | 010

miniVite

Language: C++ | License: BSD-3-Clause | 010

moderngpu

Patterns and behaviors for GPU computing

Language: C++ | License: NOASSERTION | 000

nccl

Optimized primitives for collective multi-GPU communication

Language: Cuda | License: NOASSERTION | 010

osu-micro-benchmarks-5.3.2

ROCm - UCX enabled OSU_Benchmarks

Language: C | License: NOASSERTION | 010

ppopp19-artifact

Artifact evaluation package for PPoPP 2019

Language: Python | License: Apache-2.0 | 010

push-pull

Code for paper "Implementing Push-Pull Efficiently in GraphBLAS" accepted to ICPP 2018

Language: C++ | License: Apache-2.0 | 000

S-BLAS

This package includes the implementation for Sparse-Matrix-Vector-Multiplication (SpMV) and Sparse-Matrix-Matrix-Multiplication (SpMM) for Single-node Multi-GPU (scale-up) platforms such as NVIDIA DGX-1 and DGX-2.

Language: C++ | License: NOASSERTION | 000

sep-graph

This is the repo of "SEP-Graph: Finding Shortest Execution Paths for Graph Processing under a Hybrid Framework on GPU"

Language: Cuda | License: Apache-2.0 | 010

SHAD

Scalable High-performance Algorithms and Data-structures

Language: C++ | License: Apache-2.0 | 000

SICM

Simplified Interface to Complex Memory

Language: C | License: BSD-2-Clause | 010

spECK

Efficient SpGEMM on GPU using CUDA and CSR

Language: Cuda | License: MIT | 000

Tartan

Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite

Language: Cuda | 010

TC

Language: Cuda | 010

ucx

Unified Communication X (mailing list: https://elist.ornl.gov/mailman/listinfo/ucx-group)

Language: C | License: NOASSERTION | 000