Beast code in Giters

wm901115nwpu's starred repositories

triton

Development repository for the Triton language and compiler

Language: C++ | MIT | 7400

asplos-tvm

Language: Python | Apache-2.0 | 1100

GVProf

GVProf: A Value Profiler for GPU-based Clusters

Language: Python | BSD-3-Clause | 4200

GPA

GPU Performance Advisor

Language: Python | BSD-3-Clause | 5500

triton-shared

Shared Middle-Layer for Triton Compilation

Language: MLIR | MIT | 10500

xllm

🦖 X—LLM: Cutting Edge & Easy LLM Finetuning

Language: Python | Apache-2.0 | 35300

TFMQ-DM

[CVPR 2024 Highlight] TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

Language: Jupyter Notebook | Apache-2.0 | 2500

llama_index

LlamaIndex is a data framework for your LLM applications

Language: Python | MIT | 3175800

taco

The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs

Language: C++ | NOASSERTION | 121600

pscan_kernel

Language: Python | 300

pscan

Language: Python | 3100

unsloth

Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory

Language: Python | Apache-2.0 | 980100

lectures

Material for cuda-mode lectures

Language: Jupyter Notebook | Apache-2.0 | 83900

resource-stream

CUDA related news and material links

MIT | 85900

pytorch-model-train-template

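PyTorch model training code covering single precision, half precision, mixed precision, single GPU, multi-GPU (DP / DDP), FSDP, and DeepSpeed, with a comparison of training speed and GPU memory usage across the methods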

Language: Python | 3500

ring-attention

ring-attention experiments

Language: Python | Apache-2.0 | 7100

examples

A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.

Language: Python | BSD-3-Clause | 2181700

PiPPy

Pipeline Parallelism for PyTorch

Language: Python | BSD-3-Clause | 63400

fusemix

Data-Efficient Multimodal Fusion on a Single GPU

Language: Python | 2600

parler-tts

Inference and training library for high-quality TTS models.

Language: Python | Apache-2.0 | 259600

xla

Enabling PyTorch on XLA Devices (e.g. Google TPU)

Language: C++ | NOASSERTION | 230800

vision

Datasets, Transforms and Models specific to Computer Vision

Language: Python | BSD-3-Clause | 1552500

recurrentgemma

Open weights language model from Google DeepMind, based on Griffin.

Language: Python | Apache-2.0 | 52700

EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Language: Python | Apache-2.0 | 42000

MOSS-RLHF

Language: Python | Apache-2.0 | 118200

whisper.cpp

Port of OpenAI's Whisper model in C/C++

Language: C | MIT | 3182100

ggml

Tensor library for machine learning

Language: C | MIT | 988200

fix-posture

Language: HTML | MIT | 20400

llama.cpp

llama 2 Inference

Language: C | MIT | 2000

calm

CUDA/Metal accelerated language model inference

Language: C | MIT | 31000