lipengyue's starred repositories
cuOpt-Resources
A collection of NVIDIA cuOpt samples and other resources
DeepSeek-Coder
DeepSeek Coder: Let the Code Write Itself
gpu-operator
NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes
open-webui
User-friendly WebUI for LLMs (Formerly Ollama WebUI)
clash-for-linux-backup
A Clash for Linux backup repository based on Clash Core
clash-for-AutoDL
A proxy setup adapted for AutoDL platform servers, using Clash as the proxy tool
inference
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to run inference with any open-source language, speech-recognition, or multimodal model, whether in the cloud, on-premises, or even on your laptop.
ColossalAI
Making large AI models cheaper, faster and more accessible
transformers-benchmarks
Measure real Transformer TeraFLOPS on various GPUs
generative-ai-for-beginners
18 lessons to get started building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
k8s-device-plugin
NVIDIA device plugin for Kubernetes
Megatron-LM
Ongoing research training transformer models at scale
llama-recipes
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default and custom datasets for applications such as summarization and Q&A, plus a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama3 for WhatsApp & Messenger.
LLaMA-Factory
A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Langchain-Chatchat
Langchain-Chatchat (formerly langchain-ChatGLM): a local-knowledge-based RAG and Agent application built on Langchain with LLMs such as ChatGLM, Qwen, and Llama
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
kubesphere
The container platform tailored for Kubernetes multi-cloud, datacenter, and edge management ⎈ 🖥 ☁️
drawio-desktop
Official electron build of draw.io
LLM-quickstart
Quick Start for Large Language Models (Theoretical Learning and Practical Fine-tuning)