mengyi yan's starred repositories
PPOxFamily
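PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: eight lessons that clarify the algorithm theory, straighten out the code logic, and put decision AI applications into practice)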
Chinese-LLM-Chat
A large language model fine-tuning project, including fine-tuning ChatGLM and LLaMA with QLoRA
PPO-for-Beginners
A simple and well styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.
Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, plus local CPU/GPU training and deployment
reasoning-on-cots
Implementation of the paper: "Answering Questions by Meta-Reasoning over Multiple Chains of Thought"
fm_data_tasks
Foundation Models for Data Tasks
LLaMA-Factory
A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
alpaca_farm
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
Claude_gptyier
A ChatGPT-style web page built on the Claude API
zero_shot_cot
Prod Env
Vicuna-LangChain
A simple LangChain-like implementation based on Sentence Embedding+local knowledge base, with Vicuna (FastChat) serving as the LLM. Supports both Chinese and English, and can process PDF, HTML, and DOCX formats of documents as knowledge base.
Langchain-Chatchat
Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent application built with Langchain and language models such as ChatGLM, Qwen, and Llama
bedtimenews-archive-contents
Content repository for the online transcripts of 睡前消息 (Bedtime News)
zeus-llm-trainer
Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models
alpaca-rlhf
Fine-tuning LLaMA with RLHF (Reinforcement Learning from Human Feedback) based on DeepSpeed Chat
ChatGLM-Efficient-Tuning
Efficient fine-tuning of ChatGLM-6B with PEFT