wm901115nwpu's starred repositories

triton

Development repository for the Triton language and compiler

Language: C++ | License: MIT | Stargazers: 74 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 11 | Issues: 0

GVProf

GVProf: A Value Profiler for GPU-based Clusters

Language: Python | License: BSD-3-Clause | Stargazers: 42 | Issues: 0

GPA

GPU Performance Advisor

Language: Python | License: BSD-3-Clause | Stargazers: 55 | Issues: 0

triton-shared

Shared Middle-Layer for Triton Compilation

Language: MLIR | License: MIT | Stargazers: 105 | Issues: 0

xllm

🦖 X—LLM: Cutting Edge & Easy LLM Finetuning

Language: Python | License: Apache-2.0 | Stargazers: 353 | Issues: 0

TFMQ-DM

[CVPR 2024 Highlight] TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 25 | Issues: 0

llama_index

LlamaIndex is a data framework for your LLM applications

Language: Python | License: MIT | Stargazers: 31758 | Issues: 0

taco

The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs

Language: C++ | License: NOASSERTION | Stargazers: 1216 | Issues: 0
Language: Python | Stargazers: 3 | Issues: 0
Language: Python | Stargazers: 31 | Issues: 0

unsloth

Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory

Language: Python | License: Apache-2.0 | Stargazers: 9801 | Issues: 0

lectures

Material for cuda-mode lectures

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 839 | Issues: 0

resource-stream

CUDA related news and material links

License: MIT | Stargazers: 859 | Issues: 0

pytorch-model-train-template

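PyTorch model training code covering single precision, half precision, mixed precision, single GPU, multi-GPU (DP / DDP), FSDP, and DeepSpeed, with a comparison of training speed and GPU memory usage across the methods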

Language: Python | Stargazers: 35 | Issues: 0

ring-attention

ring-attention experiments

Language: Python | License: Apache-2.0 | Stargazers: 71 | Issues: 0

examples

A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.

Language: Python | License: BSD-3-Clause | Stargazers: 21817 | Issues: 0

PiPPy

Pipeline Parallelism for PyTorch

Language: Python | License: BSD-3-Clause | Stargazers: 634 | Issues: 0

fusemix

Data-Efficient Multimodal Fusion on a Single GPU

Language: Python | Stargazers: 26 | Issues: 0

parler-tts

Inference and training library for high-quality TTS models.

Language: Python | License: Apache-2.0 | Stargazers: 2596 | Issues: 0

xla

Enabling PyTorch on XLA Devices (e.g. Google TPU)

Language: C++ | License: NOASSERTION | Stargazers: 2308 | Issues: 0

vision

Datasets, Transforms and Models specific to Computer Vision

Language: Python | License: BSD-3-Clause | Stargazers: 15525 | Issues: 0

recurrentgemma

Open weights language model from Google DeepMind, based on Griffin.

Language: Python | License: Apache-2.0 | Stargazers: 527 | Issues: 0

EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Language: Python | License: Apache-2.0 | Stargazers: 420 | Issues: 0

MOSS-RLHF

MOSS-RLHF

Language: Python | License: Apache-2.0 | Stargazers: 1182 | Issues: 0

whisper.cpp

Port of OpenAI's Whisper model in C/C++

Language: C | License: MIT | Stargazers: 31821 | Issues: 0

ggml

Tensor library for machine learning

Language: C | License: MIT | Stargazers: 9882 | Issues: 0
Language: HTML | License: MIT | Stargazers: 204 | Issues: 0

llama.cpp

llama 2 Inference

Language: C | License: MIT | Stargazers: 20 | Issues: 0

calm

CUDA/Metal accelerated language model inference

Language: C | License: MIT | Stargazers: 310 | Issues: 0