authurlord

mengyi yan's starred repositories

PPOxFamily

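PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: 8 lessons to help you master the algorithm theory, untangle the code logic, and put decision AI into practice)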

Language: Python · Apache-2.0 · 186500

TextRL

Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in Hugging Face's transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)

Language: Python · MIT · 53400

Chinese-LLM-Chat

A large language model fine-tuning project, including fine-tuning ChatGLM and LLaMA with QLoRA

Language: Python · Apache-2.0 · 2100

PPO-for-Beginners

A simple and well-styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.

Language: Python · MIT · 69400

rotom

Code for the paper "Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond"

Language: Roff · BSD-3-Clause · 2000

Chinese-LLaMA-Alpaca

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment

Language: Python · Apache-2.0 · 1804100

reasoning-on-cots

Implementation of the paper: "Answering Questions by Meta-Reasoning over Multiple Chains of Thought"

Language: Python · MIT · 8600

fm_data_tasks

Foundation Models for Data Tasks

Language: Python · 9700

zero_nlp

Chinese NLP solutions (large models, data, models, training, inference)

Language: Jupyter Notebook · MIT · 275800

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python · Apache-2.0 · 2875400

llm_rlhf

Implements reinforcement learning training for LLMs such as GPT-2, LLaMA, and BLOOM

Language: Python · 2400

alpaca_farm

A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.

Language: Python · Apache-2.0 · 74800

Lion

Code for "Lion: Adversarial Distillation of Proprietary Large Language Models (EMNLP 2023)"

Language: Python · MIT · 19400

qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Language: Jupyter Notebook · MIT · 979800

Claude_gptyier

A ChatGPT-style web page built on the Claude API

Language: JavaScript · 4800

exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.

Language: Python · MIT · 270000

Tablet

The TABLET benchmark for evaluating instruction learning with LLMs for tabular prediction.

Language: Python · 1800

trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Language: Python · MIT · 441900

CLONE_DK

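An implementation for training your own digital clone on ChatGLM-6B using chat logs and podcast articles, including the scripts used and the code for deploying it as a front-end page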

Language: Python · MIT · 23600

mnn-llm

An LLM deployment project based on MNN.

Language: C++ · Apache-2.0 · 139200

A simple LangChain-like implementation based on Sentence Embedding+local knowledge base, with Vicuna (FastChat) serving as the LLM. Supports both Chinese and English, and can process PDF, HTML, and DOCX formats of documents as knowledge base.

Language: Python · Apache-2.0 · 9100

Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent application built with Langchain and LLMs such as ChatGLM, Qwen, and Llama

Language: TypeScript · Apache-2.0 · 3053000

bedtimenews-archive-contents

Content repository for the online transcripts of Bedtime News (睡前消息)

Language: Markdown · 13000

LaWGPT

🎉 Repo for LaWGPT, Chinese-Llama tuned with Chinese legal knowledge. A large language model based on Chinese legal knowledge

Language: Python · GPL-3.0 · 576300

zeus-llm-trainer

Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models

Language: Python · Apache-2.0 · 6600

auto-cot

Official implementation for "Automatic Chain of Thought Prompting in Large Language Models" (stay tuned & more will be updated)

Language: Jupyter Notebook · Apache-2.0 · 136300

alpaca-rlhf

Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat

Language: Python · MIT · 10300

ChatGLM-Efficient-Tuning

Fine-tuning ChatGLM-6B with PEFT | Efficient ChatGLM fine-tuning based on PEFT

Language: Python · Apache-2.0 · 364300