mengyi yan (authurlord)

Company: Beihang University

Location: Beijing, China

mengyi yan's starred repositories

PPOxFamily

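PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: eight lessons to help you sort out the algorithm theory, untangle the code logic, and put decision AI into practice)

Language: Python | License: Apache-2.0 | Stargazers: 1865 | Issues: 0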

TextRL

Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in Hugging Face's transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)

Language: Python | License: MIT | Stargazers: 534 | Issues: 0

Chinese-LLM-Chat

A large language model fine-tuning project, including fine-tuning ChatGLM and LLaMA with QLoRA

Language: Python | License: Apache-2.0 | Stargazers: 21 | Issues: 0

PPO-for-Beginners

A simple and well-styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.

Language: Python | License: MIT | Stargazers: 694 | Issues: 0

rotom

Code for the paper "Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond"

Language: Roff | License: BSD-3-Clause | Stargazers: 20 | Issues: 0

Chinese-LLaMA-Alpaca

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment

Language: Python | License: Apache-2.0 | Stargazers: 18041 | Issues: 0

reasoning-on-cots

Implementation of the paper: "Answering Questions by Meta-Reasoning over Multiple Chains of Thought"

Language: Python | License: MIT | Stargazers: 86 | Issues: 0

fm_data_tasks

Foundation Models for Data Tasks

Language: Python | Stargazers: 97 | Issues: 0

Language: Python | Stargazers: 195 | Issues: 0

zero_nlp

Chinese NLP solutions (large models, data, models, training, inference)

Language: Jupyter Notebook | License: MIT | Stargazers: 2758 | Issues: 0

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | License: Apache-2.0 | Stargazers: 28754 | Issues: 0

llm_rlhf

Reinforcement learning training for LLMs such as GPT-2, LLaMA, and BLOOM

Language: Python | Stargazers: 24 | Issues: 0

alpaca_farm

A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.

Language: Python | License: Apache-2.0 | Stargazers: 748 | Issues: 0

Lion

Code for "Lion: Adversarial Distillation of Proprietary Large Language Models (EMNLP 2023)"

Language: Python | License: MIT | Stargazers: 194 | Issues: 0

qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Language: Jupyter Notebook | License: MIT | Stargazers: 9798 | Issues: 0

Claude_gptyier

A ChatGPT-style web page built on the Claude API

Language: JavaScript | Stargazers: 48 | Issues: 0

exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.

Language: Python | License: MIT | Stargazers: 2700 | Issues: 0

Tablet

The TABLET benchmark for evaluating instruction learning with LLMs for tabular prediction.

Language: Python | Stargazers: 18 | Issues: 0

trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Language: Python | License: MIT | Stargazers: 4419 | Issues: 0

CLONE_DK

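An implementation of training your own digital clone on ChatGLM-6B from chat logs and podcast articles, including the scripts used and the code for deploying the result as a front-end page

Language: Python | License: MIT | Stargazers: 236 | Issues: 0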

mnn-llm

An LLM deployment project based on MNN.

Language: C++ | License: Apache-2.0 | Stargazers: 1392 | Issues: 0

Language: Python | Stargazers: 368 | Issues: 0

Vicuna-LangChain

A simple LangChain-like implementation based on Sentence Embedding+local knowledge base, with Vicuna (FastChat) serving as the LLM. Supports both Chinese and English, and can process PDF, HTML, and DOCX formats of documents as knowledge base.

Language: Python | License: Apache-2.0 | Stargazers: 91 | Issues: 0

Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent app built with Langchain on LLMs such as ChatGLM, Qwen, and Llama

Language: TypeScript | License: Apache-2.0 | Stargazers: 30530 | Issues: 0

bedtimenews-archive-contents

Content repository for the online transcripts of 睡前消息 (Bedtime News)

Language: Markdown | Stargazers: 130 | Issues: 0

LaWGPT

🎉 Repo for LaWGPT, Chinese-Llama tuned with Chinese legal knowledge: a large language model based on Chinese legal knowledge

Language: Python | License: GPL-3.0 | Stargazers: 5763 | Issues: 0

zeus-llm-trainer

Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models

Language: Python | License: Apache-2.0 | Stargazers: 66 | Issues: 0

auto-cot

Official implementation for "Automatic Chain of Thought Prompting in Large Language Models" (stay tuned & more will be updated)

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1363 | Issues: 0

alpaca-rlhf

Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat

Language: Python | License: MIT | Stargazers: 103 | Issues: 0

ChatGLM-Efficient-Tuning

Efficient fine-tuning of ChatGLM-6B with PEFT

Language: Python | License: Apache-2.0 | Stargazers: 3643 | Issues: 0