cuda-mode / resource-stream

CUDA related news and material links

CUDA MODE Resource Stream

https://discord.gg/cudamode

Here you will find a collection of CUDA-related material (books, papers, blog posts, YouTube videos, tweets, implementations, etc.). We also collect information on higher-level tools for performance optimization and kernel development, like Triton and torch.compile() ... whatever makes the GPUs go brrrr.

Know of a great resource we should add? Please see How to contribute.

Lectures / Reading Group Live Sessions

You can find a list of upcoming lectures under the Events option in the channel list (sidebar) of our Discord server.

Recordings of the weekly lectures are published on our YouTube channel. Material (code, slides) for the individual lectures can be found in the lectures repository.

1st Contact with CUDA

2nd Contact

Papers, Case Studies

Books

CUDA Courses

CUDA Grandmasters

Tri Dao

Tim Dettmers

Sasha Rush

Practice

PyTorch Performance Optimization

PyTorch Internals & Debugging

Code / Libs

Essentials

Profiling

Python GPU Computing

Advanced Topics, Research, Compilers

News

Technical Blog Posts

Hardware Architecture

CUDA-MODE Community Projects

ring-attention

pscan

Triton Kernels / Examples

  • unsloth, which implements custom kernels in Triton for faster QLoRA training
  • Custom implementation of relative position attention (link)
  • Tri Dao's Triton implementation of Flash Attention: flash_attn_triton.py
  • YouTube playlist: Triton Conference 2023
  • LightLLM, with different Triton kernels for different LLMs

How to contribute

To share interesting CUDA-related links, please create a pull request for this file. See editing files in the GitHub documentation.

Or contact us on the CUDA MODE Discord server: https://discord.gg/cudamode

About

License: MIT License