wm901115nwpu's starred repositories

taco

The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs

Language: C++ | License: NOASSERTION | Stargazers: 1220 | Issues: 0
Language: Python | Stargazers: 31 | Issues: 0

unsloth

Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Language: Python | License: Apache-2.0 | Stargazers: 11263 | Issues: 0

lectures

Material for cuda-mode lectures

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1037 | Issues: 0

resource-stream

CUDA related news and material links

License: MIT | Stargazers: 897 | Issues: 0

pytorch-model-train-template

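PyTorch model training code covering single precision, half precision, mixed precision, single-GPU, multi-GPU (DP / DDP), FSDP, and DeepSpeed, with a comparison of training speed and GPU memory usage across the methods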

Language: Python | Stargazers: 44 | Issues: 0

ring-attention

ring-attention experiments

Language: Python | License: Apache-2.0 | Stargazers: 74 | Issues: 0

examples

A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.

Language: Python | License: BSD-3-Clause | Stargazers: 21904 | Issues: 0

PiPPy

Pipeline Parallelism for PyTorch

Language: Python | License: BSD-3-Clause | Stargazers: 648 | Issues: 0

fusemix

Data-Efficient Multimodal Fusion on a Single GPU

Language: Python | Stargazers: 31 | Issues: 0

parler-tts

Inference and training library for high-quality TTS models.

Language: Python | License: Apache-2.0 | Stargazers: 2733 | Issues: 0

xla

Enabling PyTorch on XLA Devices (e.g. Google TPU)

Language: C++ | License: NOASSERTION | Stargazers: 2353 | Issues: 0

vision

Datasets, Transforms and Models specific to Computer Vision

Language: Python | License: BSD-3-Clause | Stargazers: 15620 | Issues: 0

recurrentgemma

Open weights language model from Google DeepMind, based on Griffin.

Language: Python | License: Apache-2.0 | Stargazers: 544 | Issues: 0

EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Language: Python | License: Apache-2.0 | Stargazers: 488 | Issues: 0

MOSS-RLHF

MOSS-RLHF

Language: Python | License: Apache-2.0 | Stargazers: 1194 | Issues: 0

whisper.cpp

Port of OpenAI's Whisper model in C/C++

Language: C | License: MIT | Stargazers: 32394 | Issues: 0

ggml

Tensor library for machine learning

Language: C | License: MIT | Stargazers: 10084 | Issues: 0
Language: HTML | License: MIT | Stargazers: 204 | Issues: 0

llama.cpp

llama 2 Inference

Language: C | License: MIT | Stargazers: 22 | Issues: 0

calm

CUDA/Metal accelerated language model inference

Language: C | License: MIT | Stargazers: 326 | Issues: 0
Language: C++ | License: MIT | Stargazers: 1230 | Issues: 0

hqq

Official implementation of Half-Quadratic Quantization (HQQ)

Language: Python | License: Apache-2.0 | Stargazers: 534 | Issues: 0

torchtune

A Native-PyTorch Library for LLM Fine-tuning

Language: Python | License: BSD-3-Clause | Stargazers: 3382 | Issues: 0

skhd

Simple hotkey daemon for macOS

Language: C | License: MIT | Stargazers: 5736 | Issues: 0

long-context-attention

Sequence Parallel Attention for Long Context LLM Model Training and Inference

Language: Python | Stargazers: 173 | Issues: 0

QuaRot

Code for QuaRot, an end-to-end 4-bit inference of large language models.

Language: Python | License: Apache-2.0 | Stargazers: 163 | Issues: 0

guidance

A guidance language for controlling large language models.

Language: Jupyter Notebook | License: MIT | Stargazers: 17899 | Issues: 0

JORA

JORA: JAX Tensor-Parallel LoRA Library

Language: Python | License: NOASSERTION | Stargazers: 20 | Issues: 0

torchpipe

Boosting DL Service Throughput 1.5-4x by Ensemble Pipeline Serving with Concurrent CUDA Streams for PyTorch/LibTorch Frontend and TensorRT/CVCUDA, etc., Backends

Language: C++ | License: Apache-2.0 | Stargazers: 133 | Issues: 0