smart's starred repositories

cuda_reduce

Trying out the reduction kernels from The CUDA Handbook by Nicholas Wilt and the reduction kernels by Mark Harris (NVIDIA)

Language: C++ | Stargazers: 2 | Issues: 0

cudahandbook

Source code that accompanies The CUDA Handbook.

Language: Cuda | Stargazers: 489 | Issues: 0

xilinx-ethash

Run the ethash OpenCL kernel on Xilinx's Alveo U50

Language: C | Stargazers: 17 | Issues: 0

Batched-SpMM

New batched algorithm for sparse matrix-matrix multiplication (SpMM)

Language: Cuda | License: MIT | Stargazers: 15 | Issues: 0

SpDNN_Challenge2020

Codebase for the 2020 Graph Challenge

Language: Cuda | Stargazers: 6 | Issues: 0

Graphchallenge21

Graph Challenge 2021

Language: Cuda | Stargazers: 27 | Issues: 0

dgSPARSE-Lib

PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity

Language: Cuda | License: MIT | Stargazers: 96 | Issues: 0
Language: Cuda | License: Apache-2.0 | Stargazers: 5 | Issues: 0
Language: Cuda | License: BSD-3-Clause | Stargazers: 85 | Issues: 0

ASpT-mirror

Mirror of http://gitlab.hpcrl.cse.ohio-state.edu/chong/ppopp19_ae, refactoring for understanding

Language: C++ | Stargazers: 11 | Issues: 0

merge-spmm

Code for the paper "Design Principles for Sparse Matrix Multiplication on the GPU", accepted to Euro-Par 2018

Language: C++ | License: Apache-2.0 | Stargazers: 68 | Issues: 0

spECK

Efficient SpGEMM on GPU using CUDA and CSR

Language: Cuda | License: MIT | Stargazers: 50 | Issues: 0
Language: Cuda | Stargazers: 1 | Issues: 0

SpMV-on-Many-Core

A cross-platform Sparse Matrix Vector Multiplication (SpMV) framework for many-core architectures (GPUs and Xeon Phi).

Language: C++ | License: GPL-3.0 | Stargazers: 7 | Issues: 0
Language: Cuda | License: MIT | Stargazers: 95 | Issues: 0

myGEMM

Code appendix to an OpenCL matrix-multiplication tutorial

Language: C | License: MIT | Stargazers: 160 | Issues: 0
Language: C | Stargazers: 38 | Issues: 0
Stargazers: 1 | Issues: 0

maxas

Assembler for NVIDIA Maxwell architecture

Language: Sass | License: MIT | Stargazers: 936 | Issues: 0

tSparse

A GPU algorithm for sparse matrix-matrix multiplication

Language: Cuda | License: Apache-2.0 | Stargazers: 64 | Issues: 0

gcoospdm

Sparse-dense matrix-matrix multiplication on GPUs

Language: Python | License: MIT | Stargazers: 13 | Issues: 0

sputnik

A library of GPU kernels for sparse matrix operations.

Language: C++ | License: Apache-2.0 | Stargazers: 239 | Issues: 0
Language: Jupyter Notebook | License: MIT | Stargazers: 17 | Issues: 0

cutlass_tilesparse

CUDA templates for tile-sparse matrix multiplication based on CUTLASS.

Language: C++ | License: BSD-3-Clause | Stargazers: 46 | Issues: 0

blocksparse

Efficient GPU kernels for block-sparse matrix multiplication and convolution

Language: Cuda | License: MIT | Stargazers: 1018 | Issues: 0
Language: Cuda | Stargazers: 2061 | Issues: 0

cuda-training-series

Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)

Language: Cuda | Stargazers: 500 | Issues: 0

AmpereSparseMatmul

Study of Ampere's sparse matmul

Language: Cuda | License: MIT | Stargazers: 13 | Issues: 0

torch-blocksparse

Block-sparse primitives for PyTorch

Language: Python | License: MIT | Stargazers: 145 | Issues: 0
Language: C++ | Stargazers: 139 | Issues: 0