fyubang

Fubang ZHAO's starred repositories

llms_paper

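This repository mainly collects reading notes on top-conference papers relevant to LLM algorithm engineers (multimodal, PEFT, few-shot QA, RAG, LMM interpretability, Agents, CoT).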

21300

TigerBot

TigerBot: A multi-language multi-task LLM

Language: Python · Apache-2.0 · 222600

List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words

List of Dirty, Naughty, Obscene, and Otherwise Bad Words

CC-BY-4.0 · 285200

Awesome-Chinese-LLM

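A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train. It covers base models, vertical-domain fine-tunes and applications, datasets, and tutorials.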

1381300

MNBVC

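MNBVC (Massive Never-ending BT Vast Chinese corpus) is a very large-scale Chinese corpus that targets the 40T of data used to train chatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.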

MIT · 326100

InternLM

Official release of InternLM2.5 7B base and chat models. 1M context support

Language: Python · Apache-2.0 · 590600

xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Language: Python · Apache-2.0 · 346800

nanotron

Minimalistic large language model 3D-parallelism training

Language: Python · Apache-2.0 · 101400

JARVIS

JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

Language: Python · MIT · 2342200

ChatGLM-Tuning

A fine-tuning approach based on ChatGLM-6B + LoRA

Language: Python · MIT · 371300

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python · Apache-2.0 · 3842000

docGPT

ChatGPT directly within Google Docs as an Editor Add-on 📑

Language: JavaScript · 65900

Instruction-Tuning-Papers

Reading list on instruction tuning, a trend that started with Natural-Instruction (ACL 2022), FLAN (ICLR 2022) and T0 (ICLR 2022).

74200

torchscale

Foundation Architecture for (M)LLMs

Language: Python · MIT · 297900

Megatron-LM

Ongoing research training transformer models at scale

Language: Python · NOASSERTION · 953900

llama-dl

High-speed download of LLaMA, Facebook's 65B parameter GPT model

Language: Shell · GPL-3.0 · 416700

ChatGPT

Reverse engineered ChatGPT API

Language: Python · GPL-2.0 · 2799000

GPT4IE

An open-source and powerful Information Extraction toolkit based on GPT (GPT for Information Extraction; GPT4IE for short). Note: we set a default OpenAI key in the tool; you can tell us if the key reaches its limit.

Language: JavaScript · MIT · 16700

PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.

Language: Python · Apache-2.0 · 1180300

InstructionWild

nlpcda

aliyun-odps-python-sdk

DocumentLayoutAnalysis

CasRel

datasets

BERT-NER

OpenAttack

TextAttack

DeepIE

cleanlab