Beast code in Giters

smart's starred repositories

cuda_reduce

Trying out the reduction kernels from The CUDA Handbook by Nicholas Wilt and those by Mark Harris (NVIDIA)

Language: C++ | Stars: 2

cudahandbook

Source code that accompanies The CUDA Handbook.

Language: Cuda | Stars: 489

xilinx-ethash

Run the Ethash OpenCL kernel on Xilinx's Alveo U50

Language: C | Stars: 17

Batched-SpMM

New batched algorithm for sparse matrix-matrix multiplication (SpMM)

Language: Cuda | License: MIT | Stars: 15

SpDNN_Challenge2020

Codebase for the 2020 Graph Challenge

Language: Cuda | Stars: 6

Graphchallenge21

Graph Challenge 2021

Language: Cuda | Stars: 27

dgSPARSE-Lib

PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity

Language: Cuda | License: MIT | Stars: 96

dgSPARSE-Library

Language: Cuda | License: Apache-2.0 | Stars: 5

merge-spmv

Language: Cuda | License: BSD-3-Clause | Stars: 85

ASpT-mirror

Mirror of http://gitlab.hpcrl.cse.ohio-state.edu/chong/ppopp19_ae, refactored for easier understanding

Language: C++ | Stars: 11

merge-spmm

Code for the paper "Design Principles for Sparse Matrix Multiplication on the GPU", accepted to Euro-Par 2018

Language: C++ | License: Apache-2.0 | Stars: 68

spECK

Efficient SpGEMM on GPU using CUDA and CSR

Language: Cuda | License: MIT | Stars: 50

cublas_dgemm_batched

Language: Cuda | Stars: 1

SpMV-on-Many-Core

A cross-platform Sparse Matrix Vector Multiplication (SpMV) framework for many-core architectures (GPUs and Xeon Phi).

Language: C++ | License: GPL-3.0 | Stars: 7

ge-spmm

Language: Cuda | License: MIT | Stars: 95

myGEMM

Code appendix to an OpenCL matrix-multiplication tutorial

Language: C | License: MIT | Stars: 160

batched_gemm

Language: C | Stars: 38

Batched_gemm

Stars: 1

maxas

Assembler for NVIDIA Maxwell architecture

Language: Sass | License: MIT | Stars: 936

tSparse

A GPU algorithm for sparse matrix-matrix multiplication

Language: Cuda | License: Apache-2.0 | Stars: 64

gcoospdm

Sparse-dense matrix-matrix multiplication on GPUs

Language: Python | License: MIT | Stars: 13

sputnik

A library of GPU kernels for sparse matrix operations.

Language: C++ | License: Apache-2.0 | Stars: 239

gpu-sparsert

Language: Jupyter Notebook | License: MIT | Stars: 17

cutlass_tilesparse

CUDA templates for tile-sparse matrix multiplication based on CUTLASS.

Language: C++ | License: BSD-3-Clause | Stars: 46

blocksparse

Efficient GPU kernels for block-sparse matrix multiplication and convolution

Language: Cuda | License: MIT | Stars: 1018

CUDA_Freshman

Language: Cuda | Stars: 2061

cuda-training-series

Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)

Language: Cuda | Stars: 500

AmpereSparseMatmul

Study of Ampere's Sparse Matmul

Language: Cuda | License: MIT | Stars: 13

torch-blocksparse

Block-sparse primitives for PyTorch

Language: Python | License: MIT | Stars: 145

TileSparsity

Language: C++ | Stars: 139