Junesoo Kang's starred repositories
ringattention
Transformers with Arbitrarily Large Context
gds-nvidia-fs
NVIDIA GPUDirect Storage Driver
pytorch-direct_dgl
PyTorch-Direct code on top of PyTorch-1.8.0nightly (e152ca5) for Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture (accepted by PVLDB)
IGB-Datasets
Largest real-world open-source graph dataset - Work done under the IBM-Illinois Discovery Accelerator Institute and Amazon Research Awards, in collaboration with NVIDIA Research.
cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
graph-based-deep-learning-literature
Links to conference publications in graph-based deep learning
DUCATI_SIGMOD
Accepted paper of SIGMOD 2023, DUCATI: A Dual-Cache Training System for Graph Neural Networks on Giant Graphs with the GPU
matmulfreellm
Implementation for MatMul-free LM.
dgSPARSE-Lib
PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity
ThunderKittens
Tile primitives for speedy kernels
onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
cudnn.torch
Torch-7 FFI bindings for NVIDIA CuDNN