luoyingfeng's starred repositories

LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

Language: Python, License: MIT, Stargazers: 495, Issues: 0
Language: Jupyter Notebook, License: MIT, Stargazers: 6, Issues: 0

Poetry

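A corpus of classical Chinese poetry crawled from the internet, covering poems from the pre-Qin period to the present day, 1,014,508 poems in total.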

Stargazers: 25, Issues: 0

Awesome-LLM

Awesome-LLM: a curated list of Large Language Models

License: CC0-1.0, Stargazers: 16328, Issues: 0

promptsource

Toolkit for creating, sharing and using natural language prompts.

Language: Python, License: Apache-2.0, Stargazers: 2603, Issues: 0

yanmtt

Yet Another Neural Machine Translation Toolkit

Language: Python, License: MIT, Stargazers: 170, Issues: 0

mergekit

Tools for merging pretrained large language models.

Language: Python, License: LGPL-3.0, Stargazers: 4164, Issues: 0

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python, License: MIT, Stargazers: 29874, Issues: 0

guidance

A guidance language for controlling large language models.

Language: Jupyter Notebook, License: MIT, Stargazers: 18284, Issues: 0

OPUS

The Open Parallel Corpus

Language: JavaScript, Stargazers: 51, Issues: 0

Introduction-to-Transformers

An introduction to basic concepts of Transformers and key techniques of their recent advances.

License: CC0-1.0, Stargazers: 43, Issues: 0

alpaca-lora

Instruct-tune LLaMA on consumer hardware

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 18433, Issues: 0

olm-datasets

Pipeline for pulling and processing online language model pretraining data from the web

Language: Python, License: Apache-2.0, Stargazers: 170, Issues: 0

natural-instructions

Expanding natural instructions

Language: Python, License: Apache-2.0, Stargazers: 926, Issues: 0

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python, License: Apache-2.0, Stargazers: 35873, Issues: 0

TigerBot

TigerBot: A multi-language multi-task LLM

Language: Python, License: Apache-2.0, Stargazers: 2225, Issues: 0

MNBVC

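MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, lines of dialogue, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.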

License: MIT, Stargazers: 3252, Issues: 0

BELLE

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

Language: HTML, License: Apache-2.0, Stargazers: 7741, Issues: 0

ChatPaper

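Use ChatGPT to summarize arXiv papers. Speeds up the whole research workflow by using ChatGPT for full-paper summarization, professional translation, polishing, reviewing, and review responses.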

Language: Python, License: NOASSERTION, Stargazers: 18009, Issues: 0

awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models.

Language: Python, License: MIT, Stargazers: 4571, Issues: 0

CPM-Live

Live Training for Open-source Big Models

Language: Python, Stargazers: 510, Issues: 0
Language: Python, License: Apache-2.0, Stargazers: 9016, Issues: 0

awesome-chatgpt-prompts

This repo includes ChatGPT prompt curation to use ChatGPT better.

Language: HTML, License: CC0-1.0, Stargazers: 107472, Issues: 0

textgen

TextGen: implementations of text generation models, including LLaMA, ChatGLM, BLOOM, GPT2, Seq2Seq, BART, T5, UDA, SongNet, and others, with training and prediction that work out of the box.

Language: Python, License: Apache-2.0, Stargazers: 915, Issues: 0

TextBox

TextBox 2.0 is a text generation library with pre-trained language models

Language: Python, License: MIT, Stargazers: 1068, Issues: 0

EDA_NLP_for_Chinese

An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese text. NLP data augmentation. Paper reading notes.

Language: Python, Stargazers: 1337, Issues: 0

Final_word_Similarity

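Combines word-similarity methods based on the extended Tongyici Cilin thesaurus (同义词词林扩展版) and HowNet, giving broader vocabulary coverage and more accurate results.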

Language: Python, License: MIT, Stargazers: 713, Issues: 0

CDial-GPT

A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models

Language: Python, License: MIT, Stargazers: 1734, Issues: 0
Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 425, Issues: 0

simpleGAN

A simple GAN for generating fake anime images

Language: Python, Stargazers: 1, Issues: 0