SuMeng123's starred repositories

LivePortrait

Bring portraits to life!

Language: Python | License: NOASSERTION | Stargazers: 12055 | Issues: 0

learn-nlp-with-transformers

We want to create a repo to illustrate the usage of transformers in Chinese.

Language: Shell | Stargazers: 2193 | Issues: 0

Firefly

Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Language: Python | Stargazers: 5694 | Issues: 0

LLMSurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

Language: Python | Stargazers: 10114 | Issues: 0

uniem

unified embedding model

Language: Python | License: Apache-2.0 | Stargazers: 821 | Issues: 0

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Language: Python | License: Apache-2.0 | Stargazers: 2099 | Issues: 0

LLaMA-Factory

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

Language: Python | License: Apache-2.0 | Stargazers: 31787 | Issues: 0

LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Language: Python | License: Apache-2.0 | Stargazers: 8226 | Issues: 0

Llama-Chinese

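The Llama Chinese community. Online demos and fine-tuned models for Llama3 are now available, the latest Llama3 learning resources are aggregated in real time, and all code has been updated for Llama3. The goal is to build the best Chinese Llama large model, fully open source and available for commercial use.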

Language: Python | Stargazers: 13724 | Issues: 0

MNBVC

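MNBVC (Massive Never-ending BT Vast Chinese corpus) is an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train chatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) text. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poems, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.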

License: MIT | Stargazers: 3414 | Issues: 0

Baichuan2

A series of large language models developed by Baichuan Intelligent Technology

Language: Python | License: Apache-2.0 | Stargazers: 4079 | Issues: 0

XVERSE-13B

XVERSE-13B: A multilingual large language model developed by XVERSE Technology Inc.

Language: Python | License: Apache-2.0 | Stargazers: 647 | Issues: 0

ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM | Open-source bilingual dialogue language model

Language: Python | License: NOASSERTION | Stargazers: 15702 | Issues: 0

direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Language: Python | License: Apache-2.0 | Stargazers: 2056 | Issues: 0

Baichuan-7B

A large-scale 7B pretraining language model developed by BaiChuan-Inc.

Language: Python | License: Apache-2.0 | Stargazers: 5670 | Issues: 0

Stargazers: 875 | Issues: 0

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python | License: NOASSERTION | Stargazers: 1850 | Issues: 0

FlagAI

FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use, and extensible toolkit for large-scale models.

Language: Python | License: Apache-2.0 | Stargazers: 3816 | Issues: 0

Safety-Prompts

Chinese safety prompts for evaluating and improving the safety of LLMs.

License: Apache-2.0 | Stargazers: 852 | Issues: 0

safe-rlhf

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Language: Python | License: Apache-2.0 | Stargazers: 1311 | Issues: 0

Linly

Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pre-training and instruction fine-tuning datasets

Language: Python | Stargazers: 3028 | Issues: 0

RLHF

Implementation of Chinese ChatGPT

Language: Python | Stargazers: 283 | Issues: 0

transformers_tasks

⭐️ NLP Algorithms with transformers lib. Supporting Text-Classification, Text-Generation, Information-Extraction, Text-Matching, RLHF, SFT etc.

Language: Jupyter Notebook | Stargazers: 2125 | Issues: 0

Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model. A low-resource Chinese llama+lora solution whose structure follows alpaca.

Language: C | License: Apache-2.0 | Stargazers: 4143 | Issues: 0

ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model | Open-source bilingual dialogue language model

Language: Python | License: Apache-2.0 | Stargazers: 40461 | Issues: 0

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python | License: Apache-2.0 | Stargazers: 29387 | Issues: 0

BELLE

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese dialogue large model)

Language: HTML | License: Apache-2.0 | Stargazers: 7841 | Issues: 0

EVA

EVA: Large-scale Pre-trained Chit-Chat Models

Language: Python | License: MIT | Stargazers: 305 | Issues: 0

CSTS

Chinese natural language inference and semantic similarity datasets

Stargazers: 336 | Issues: 0

Fengshenbang-LM

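Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center at the IDEA Research Institute, intended to serve as infrastructure for Chinese AIGC and cognitive intelligence.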

Language: Python | License: Apache-2.0 | Stargazers: 4006 | Issues: 0