66RING's repositories
tiny-flash-attention
A flash attention tutorial written in Python, Triton, CUDA, and CUTLASS.
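
The core trick the tutorial builds up to is the online (streaming) softmax, which lets attention be computed tile by tile without materializing the full n×n score matrix. A minimal NumPy sketch of that recurrence (function and variable names are illustrative, not the repo's API):

```python
import numpy as np

def flash_attention(q, k, v, block=64):
    """Tiled attention with an online softmax: per-tile memory
    instead of materializing the full (n x n) score matrix."""
    n, d = q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros_like(v)
    m = np.full(n, -np.inf)                  # running row max
    l = np.zeros(n)                          # running sum of exp
    for start in range(0, n, block):
        kb = k[start:start + block]          # (b, d) key tile
        vb = v[start:start + block]          # (b, d) value tile
        s = q @ kb.T * scale                 # (n, b) partial scores
        m_new = np.maximum(m, s.max(axis=1))
        p = np.exp(s - m_new[:, None])       # tile probabilities
        alpha = np.exp(m - m_new)            # rescale the old stats
        l = l * alpha + p.sum(axis=1)
        out = out * alpha[:, None] + p @ vb
        m = m_new
    return out / l[:, None]

# agrees with the naive softmax(QK^T / sqrt(d)) V
q, k, v = (np.random.randn(128, 32) for _ in range(3))
s = q @ k.T / np.sqrt(32)
p = np.exp(s - s.max(1, keepdims=True))
ref = (p / p.sum(1, keepdims=True)) @ v
assert np.allclose(flash_attention(q, k, v), ref, atol=1e-6)
```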
LongShortTokenDecoding
Long-short token decoding speeds up long-context LLM inference by 4x. About a hundred lines of core code, open-sourced for learning.
ring-attention-pytorch
A tiny ring attention implementation for learning purposes.
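
Ring attention shards Q, K, and V across devices and rotates the K/V shards around a ring, folding each incoming shard into the result with the same online-softmax rescaling flash attention uses. A single-process sketch of that schedule (the ring "send" is just a list rotation here; names are illustrative):

```python
import numpy as np

def ring_attention(q_shards, k_shards, v_shards):
    """Simulate ring attention: each 'device' keeps its Q shard fixed
    and sees every K/V shard once as they rotate around the ring."""
    world = len(q_shards)
    scale = 1.0 / np.sqrt(q_shards[0].shape[1])
    # per-device accumulators: output, running max, running normalizer
    outs = [np.zeros_like(q) for q in q_shards]
    ms = [np.full(q.shape[0], -np.inf) for q in q_shards]
    ls = [np.zeros(q.shape[0]) for q in q_shards]
    kv = list(zip(k_shards, v_shards))
    for _ in range(world):                    # one ring rotation per step
        for rank in range(world):
            kb, vb = kv[rank]
            s = q_shards[rank] @ kb.T * scale
            m_new = np.maximum(ms[rank], s.max(axis=1))
            p = np.exp(s - m_new[:, None])
            alpha = np.exp(ms[rank] - m_new)
            ls[rank] = ls[rank] * alpha + p.sum(axis=1)
            outs[rank] = outs[rank] * alpha[:, None] + p @ vb
            ms[rank] = m_new
        kv = kv[1:] + kv[:1]                  # pass K/V to the next rank
    return [o / l[:, None] for o, l in zip(outs, ls)]

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((256, 32)) for _ in range(3))
out = np.concatenate(ring_attention(
    list(q.reshape(4, 64, 32)),               # 4 "devices", 64 queries each
    list(k.reshape(4, 64, 32)),
    list(v.reshape(4, 64, 32))))
```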
66RING.github.io
https://66ring.github.io/
Counting-Stars-Local
Counting-Stars scripts for evaluating local LLMs.
LLMTest_NeedleInAHaystack-Local
Run Needle In A Haystack with a local LLM; see the Makefile.
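
The test itself is simple: bury a "needle" sentence at a chosen depth inside long filler text and check whether the model retrieves it. A minimal, model-agnostic sketch (the llm callable, filler, and needle strings are placeholders; the actual repo drives this through its Makefile):

```python
def needle_test(llm, context_len=8000, depth=0.5,
                needle="The secret ingredient is fresh basil.",
                question="What is the secret ingredient?"):
    """Insert `needle` at a relative `depth` inside filler text and
    check whether the model's answer retrieves it. `llm` is any
    prompt -> completion callable wrapping a local model.
    `context_len` is measured in characters here for simplicity."""
    filler = "The quick brown fox jumps over the lazy dog. "
    haystack = (filler * (context_len // len(filler) + 1))[:context_len]
    pos = int(len(haystack) * depth)
    prompt = (haystack[:pos] + " " + needle + " " + haystack[pos:]
              + f"\n\nQuestion: {question}\nAnswer:")
    return "basil" in llm(prompt).lower()

# sweep insertion depths to find where retrieval starts failing:
# for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
#     print(depth, needle_test(my_local_llm, depth=depth))
```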
pytorch-cuda-binding-tutorial
A tutorial for building custom CUDA and C++ functions for PyTorch.
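
The JIT path for this is torch.utils.cpp_extension; a minimal sketch with load_inline, which compiles a C++ source string and auto-generates the Python binding (a CUDA kernel follows the same pattern with an extra cuda_sources= string):

```python
import torch
from torch.utils.cpp_extension import load_inline

# A toy C++ op; load_inline compiles it and binds it into Python.
cpp_source = """
#include <torch/extension.h>

torch::Tensor scaled_add(torch::Tensor a, torch::Tensor b, double s) {
    return a + b * s;   // runs in C++, dispatched like any torch op
}
"""

ext = load_inline(
    name="scaled_add_ext",
    cpp_sources=cpp_source,
    functions=["scaled_add"],   # auto-generates the pybind11 binding
)

a, b = torch.randn(4), torch.randn(4)
print(ext.scaled_add(a, b, 2.0))   # == a + 2.0 * b
```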
15445-bootcamp
A basic introduction to coding in modern C++.
academicpages.github.io
Github Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
bufferline.nvim
A snazzy bufferline for Neovim
clash-verge
A Clash GUI based on tauri. Supports Windows, macOS and Linux.
ContinuousBatching
A demo of continuous batching, which is simpler than you think.
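
The idea is to schedule at the granularity of a single decode step: the moment a sequence finishes, its batch slot goes to a waiting request instead of idling until the whole batch drains. A toy scheduler sketch (no real model; each "decode step" just decrements a per-request counter):

```python
import random
from collections import deque

def continuous_batching(requests, max_batch=4):
    """Toy scheduler: rebuild the batch every decode step, so a finished
    sequence's slot is handed to a waiting request immediately instead of
    waiting for the whole batch to drain (static batching)."""
    waiting = deque(requests)          # (request_id, tokens_to_generate)
    running, step = [], 0
    while waiting or running:
        # admit waiting requests into free slots
        while waiting and len(running) < max_batch:
            running.append(list(waiting.popleft()))
        # one decode step for every running sequence ("the forward pass")
        for seq in running:
            seq[1] -= 1
        step += 1
        # evict finished sequences; their slots refill at the loop top
        for seq in running:
            if seq[1] == 0:
                print(f"step {step}: request {seq[0]} finished")
        running = [seq for seq in running if seq[1] > 0]
    return step

reqs = [(i, random.randint(1, 10)) for i in range(8)]
print("total decode steps:", continuous_batching(reqs))
```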
flash-attention
Fast and memory-efficient exact attention
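
Typical usage goes through flash_attn_func, which expects half-precision (batch, seqlen, nheads, headdim) tensors on a CUDA device; a minimal call (shapes chosen arbitrarily):

```python
import torch
from flash_attn import flash_attn_func

# q, k, v: (batch, seqlen, nheads, headdim), fp16/bf16, CUDA only
q, k, v = (torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
           for _ in range(3))
out = flash_attn_func(q, k, v, causal=True)   # same shape as q
```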
LightSeq
Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers
llama-playground
Play with LLaMA.
RULER
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
ThunderKittens
Tile primitives for speedy kernels
vattention
Dynamic Memory Management for Serving LLMs without PagedAttention
zephyr-nvim
A customized fork of nvimdev/zephyr-nvim.