Ruilin Zhao's starred repositories


fastapi

FastAPI framework, high performance, easy to learn, fast to code, ready for production

Language: Python · License: MIT · Stars: 69867

parallel-decoding

Repository of the paper "Accelerating Transformer Inference for Translation via Parallel Decoding"

Language: Python · License: Apache-2.0 · Stars: 87

ZLUDA

CUDA on AMD GPUs

Language: Rust · License: Apache-2.0 · Stars: 7337

tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Language: Python · License: MIT · Stars: 9322
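The core idea behind BPE tokenisers like tiktoken can be sketched in a few lines: repeatedly merge the adjacent token pair with the best (lowest) merge rank until no ranked pair remains. The merge table below is a made-up toy example, not tiktoken's real byte-level vocabulary or API.

```python
# Toy sketch of the byte-pair-encoding merge loop behind BPE tokenisers.
# The merge table here is hypothetical; real vocabularies hold ~100k ranks.

def bpe_encode(word, merges):
    """Repeatedly merge the adjacent pair with the lowest merge rank."""
    tokens = list(word)
    while True:
        best = None  # (index, rank) of the best mergeable pair
        for i in range(len(tokens) - 1):
            rank = merges.get((tokens[i], tokens[i + 1]))
            if rank is not None and (best is None or rank < best[1]):
                best = (i, rank)
        if best is None:
            return tokens
        i = best[0]
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]  # merge the pair

merges = {("l", "o"): 0, ("lo", "w"): 1, ("e", "r"): 2}
print(bpe_encode("lower", merges))  # ['low', 'er']
```

The real library wraps this loop in Rust for speed and operates on UTF-8 bytes rather than characters, which is why it handles arbitrary input without an unknown-token fallback.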

nccl-tests

NCCL Tests

Language: Cuda · License: BSD-3-Clause · Stars: 637

ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM (open-source bilingual dialogue language model)

Language: Python · License: NOASSERTION · Stars: 15355

Pai-Megatron-Patch

The official repository of Pai-Megatron-Patch, Alibaba Cloud's toolkit for large-scale LLM and VLM training.

Language: Python · License: Apache-2.0 · Stars: 388

PIA

[CVPR 2024] PIA, your Personalized Image Animator: animate your images from a text prompt, combined with DreamBooth, to produce striking videos.

Language: Python · License: Apache-2.0 · Stars: 687

lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Language: Python · License: Apache-2.0 · Stars: 1353
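Servers like lorax can hot-swap thousands of adapters because LoRA replaces full fine-tuning with a small low-rank update: the frozen weight W is combined with trainable matrices A and B as W + (alpha / r) · BA. A minimal dependency-free sketch of that arithmetic, with made-up values for illustration:

```python
# Sketch of the low-rank update behind LoRA-style adapters:
# W_eff = W + (alpha / r) * B @ A, leaving the base weight W frozen.
# The matrices and hyperparameters below are illustrative only.

def matmul(X, Y):
    """Plain-Python matrix multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_effective_weight(W, A, B, alpha, r):
    """Return W + (alpha / r) * B @ A without modifying W."""
    BA = matmul(B, A)
    s = alpha / r
    return [[w + s * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, BA)]

W = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 base weight
A = [[1.0, 1.0]]              # r x d_in adapter, rank r = 1
B = [[0.5], [0.5]]            # d_out x r adapter
print(lora_effective_weight(W, A, B, alpha=2.0, r=1))
# [[2.0, 1.0], [1.0, 2.0]]
```

Because only A and B differ between fine-tunes, a multi-LoRA server keeps one copy of W in GPU memory and applies each request's adapter pair on the fly.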

modulus

Open-source deep-learning framework for building, training, and fine-tuning deep learning models using state-of-the-art Physics-ML methods

Language: Python · License: Apache-2.0 · Stars: 597

Llama-Chinese

Llama Chinese community: the best Chinese Llama LLM, fully open source and commercially usable.

Language: Python · Stars: 8980

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python · License: Apache-2.0 · Stars: 2087

Model-References

TensorFlow and PyTorch Reference models for Gaudi(R)

Language: Python · Stars: 128

Firefly-LLaMA2-Chinese

Firefly Chinese LLaMA-2 model; supports continued pre-training of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, Bloom, and other large models.

Language: Python · Stars: 345

llama.cpp

LLM inference in C/C++

Language: C++ · License: MIT · Stars: 53855

llama-recipes

Scripts for fine-tuning Llama 2 with composable FSDP and PEFT methods, covering single- and multi-node GPU setups. Supports default and custom datasets for applications such as summarization and question answering, plus a number of inference solutions such as HF TGI and vLLM for local or cloud deployment, and demo apps showcasing Llama 2 for WhatsApp and Messenger.

Language: Jupyter Notebook · License: NOASSERTION · Stars: 8062

GitHub-Chinese-Top-Charts

:cn: GitHub Chinese Top Charts: separate "software | resources" charts for each language, pinpointing good Chinese-language projects so you can find what you need and learn efficiently.

Language: Java · License: NOASSERTION · Stars: 86972

Torch-Pruning

[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs

Language: Python · License: MIT · Stars: 2212

mixtral-offloading

Run Mixtral-8x7B models in Colab or consumer desktops

Language: Python · License: MIT · Stars: 2190

QLLM

A general 2-8-bit quantization toolbox supporting GPTQ/AWQ/HQQ, with easy export to ONNX/ONNX Runtime.

Language: Python · License: Apache-2.0 · Stars: 68

OpenVoice

Instant voice cloning by MyShell.

Language: Python · License: NOASSERTION · Stars: 15526

llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Language: Jupyter Notebook · License: Apache-2.0 · Stars: 26964

gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

Language: Python · License: Apache-2.0 · Stars: 1658

AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Language: Python · License: MIT · Stars: 3604

AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization, with roughly a 2x speedup during inference.

Language: Python · License: MIT · Stars: 1097

smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Language: Python · License: MIT · Stars: 972
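The post-training quantization methods above (GPTQ, AWQ, SmoothQuant) all refine the same basic operation: mapping float weights to low-bit integers via a scale factor, then bounding the round-trip error. A minimal symmetric per-tensor int8 sketch of that baseline, with illustrative values, none of it taken from any of these repos:

```python
# Minimal sketch of symmetric int8 post-training quantization: the
# baseline that SmoothQuant/GPTQ-style methods improve on.

def quantize_int8(values):
    """Map floats to [-127, 127] integers with one per-tensor scale."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # avoid scale == 0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the integers and scale."""
    return [x * scale for x in q]

w = [0.1, -0.5, 0.25, 1.0]
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q, err)  # round-trip error is bounded by the scale
```

The catch that motivates SmoothQuant is that activations often have outlier channels, which inflate the shared scale and crush the precision of everything else; SmoothQuant migrates that difficulty from activations into the weights before quantizing.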

ChatGLM3

ChatGLM3 series: Open Bilingual Chat LLMs (open-source bilingual dialogue language models)

Language: Python · License: Apache-2.0 · Stars: 11208