fanfannothing / DecryptPrompt

A collection of Prompt & LLM papers, open-source data & models, and AIGC applications

DecryptPrompt

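If the sudden arrival of LLMs has left you discouraged, try reading "Choose Your Weapon: Survival Strategies for Depressed AI Academics" in the repository root. The following content is updated continuously. Star to keep updated~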

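  1. Open-source LLMs
  2. Instruction-tuning and RLHF data, plus training frameworks
  3. Prompt and LLM papers, organized by subtopic
  4. AIGC applications
  5. Prompt guides and tutorials
  6. Commentary on ChatGPT and AGI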

My blogs & ChatGPT applications

Models and Data

International Models

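Model | Description
Google Bard | Google's Bard is late but has arrived; the waitlist is open
Claude | ChatGPT's biggest competitor, Claude, has also opened applications, with unlimited trial use in Slack
Falcon | Trained by the UAE's Technology Innovation Institute on 1 trillion very high-quality tokens; the 1B, 7B, and 40B models are open source and free for commercial use. Evidently money is no object for its backers
LLaMA | Meta's open-source foundation LLM, from 7 billion to 65 billion parameters
MPT | MosaicML's new open-source pre-trained + instruction-tuned models; commercially usable, with support for very long inputs of up to 84k tokens
RedPajama | After open-sourcing its pre-training data, the RedPajama project released 3B and 7B pre-trained + instruction-tuned models
ChatLLaMA | LLaMA fine-tuned with RLHF
Alpaca | Open-sourced by Stanford; the 7B LLaMA fine-tuned on 52k examples
Alpaca-lora | LLaMA fine-tuned with LoRA
Dromedary | IBM's self-aligned model built on the LLaMA base
Vicuna | Open-sourced by former Alpaca members and others; based on LLaMA 13B and instruction-tuned on ShareGPT data; proposed using GPT-4 to evaluate model quality
koala | LLaMA fine-tuned on open instruction sets such as Alpaca and HC3 plus ChatGPT data such as ShareGPT; ranks fairly high on leaderboards
ColossalChat | HPC-AI Tech's open-source LLaMA + RLHF fine-tuning
MiniGPT4 | Vicuna + BLIP2 for text-vision fusion
StackLLama | LLaMA trained on Stack Exchange data with SFT + RL
Cerebras | Cerebras open-sourced 7 models from about 100 million to 13 billion parameters, fully open from pre-training data to weights
PaLM-E | Google's multimodal large model; combines the 540B PaLM language model with the 22B ViT vision model to form the 562B PaLM-E, a new breakthrough for robotics applications
Dolly-v2 | Commercially usable open-source 7B instruction-tuned model, fine-tuned on GPT-J-6B
OpenChatKit | Built by OpenAI researchers: a fine-tuned GPT-NeoX-20B plus a 6B moderation model for filtering
MetaLM | Microsoft's open-source large-scale self-supervised pre-trained model
Amazon Titan | Amazon adds its own large models to AWS
OPT-IML | Meta's reproduction of GPT-3, up to 175B, though it does not match GPT-3 in quality
Bloom | From BigScience; the largest version is 176B
BloomZ | From BigScience; fine-tuned from Bloom
Galactica | Similar to Bloom, but trained more specifically for scientific research
T0 | From BigScience; 3B to 11B models instruction-tuned from T5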

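Chinese Models

Model | Description
ChatGLM | Tsinghua's open-source bilingual (Chinese-English) dialogue language model, trained with code data, instruction tuning, and RLHF. A 130B version, the same size as the GLM listed below, is still in development. Tried it, and it exceeded expectations!
Moss | Vindication for Fudan! All pre-training and instruction-tuning data and models are open source. Commercially usable
Wombat-7B | DAMO Academy's open-source language model aligned with RRHF instead of reinforcement learning; Alpaca base
TigerBot | TigerBot (虎博) open-sourced 7B and 180B models along with pre-training and fine-tuning corpora
Chinese-LLaMA-Alpaca | Harbin Institute of Technology's Chinese instruction-tuned LLaMA
Luotuo | Chinese instruction-tuned LLaMA and ChatGLM
文心一言 (ERNIE Bot) | Got an invitation code and tried it. It is noticeably less personable, but the quality is not bad at all. Go domestic models! The commercial terms are rather one-sided, though
通义千问 (Tongyi Qianwen) | Alibaba's LLM, now open for applications
星火 (Spark) | iFlytek's Spark; genuinely strong at math
BiLLa | Three-stage training: pre-training with an expanded LLaMA vocabulary, SFT on a 1:1 mix of pre-training and task data, then SFT on instruction samples
Phoenix | CUHK's open-source Phoenix and Chimera LLMs; Bloom base, 40+ languages supported
OpenBuddy | LLaMA fine-tuned for multilingual dialogue
Guanaco | LLaMA 7B base, fine-tuned on the Alpaca 52K data plus 534K multilingual instruction examples
ziya | IDEA Research's continued pre-training + SFT + RM + PPO + HFTT + COHFT + RBRS on 7B/13B LLaMA
Chinese Vicuna | LLaMA 7B base, trained on Belle + Guanaco data
Linly | LLaMA 7B base, trained on 7 instruction-tuning datasets including belle, guanaco, pclue, firefly, CSL, and newscommentary
Firefly | Chinese 2.6B model with improved Chinese writing and classical Chinese ability; the full training code is yet to be open-sourced, so only the model is available for now
Baize | LLaMA fine-tuned on 100k self-chat dialogues
BELLE | Chinese optimization of open-source models using ChatGPT-generated data
Chatyuan | The earliest open-source Chinese dialogue model after ChatGPT's release; T5 architecture, derived from PromptCLUE below
PromptCLUE | Multi-task prompt language model
PLUG | Large model released by Alibaba DAMO Academy; a download link is provided after submitting an application
CPM2.0 | CPM2.0 released by BAAI
GLM | Bilingual (Chinese-English) 130B pre-trained model released by Tsinghua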

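Domain-Specific Models

Model | Description
MedPalm | Built by Google on Flan-PaLM via instruction prompt tuning on multiple types of medical QA data; MultiMedQA was constructed alongside it
ChatDoctor | Instruction-tuned on 110K real doctor-patient dialogues + 5K ChatGPT-generated examples
Huatuo Med-ChatGLM | A Chinese medical instruction dataset built from a medical knowledge graph and ChatGPT, plus multi-turn QA data built from medical literature and ChatGPT
Chinese-vicuna-med | Chinese-vicuna fine-tuned on cMedQA2 data
OpenBioMed | Tsinghua AIR's open-source lightweight BioMedGPT; knowledge graph & multimodal pre-trained models for 20+ biological research areas
DoctorGLM | GLM fine-tuned on ChatDoctor + MedDialog + CMD multi-turn dialogues and single-turn instruction samples
MedicalGPT-zh | QA generated by ChatGPT from a self-built medical database + scenario dialogues constructed with SELF across 16 settings
PMC-LLaMA | LLaMA fine-tuned on medical papers
NHS-LLM | Model fine-tuned on ChatGPT-generated medical QA and dialogues
LawGPT-zh | 52k single-turn QA pairs obtained by cleaning the CrimeKgAssitant dataset with ChatGPT + scenario QA generated by ChatGPT from the 9k most essential legal provisions in the handbook of PRC law + knowledge QA pairs built by ChatGPT from text
FinChat.io | Trained on the latest financial data, earnings call transcripts, quarterly and annual reports, investment books, and more
OpenGPT | Framework for generating domain LLM instruction samples and fine-tuning
乾元 BigBang 200M financial model | Financial-domain pre-training + task fine-tuning
度小满 (Du Xiaoman) 100B-scale financial LLM | Financial + Chinese pre-training and fine-tuning on top of Bloom-176B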

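Instruction Tuning & RL Tools

Tool description | Link
LoRA: low-rank instruction-tuning approach | https://github.com/tloen/alpaca-lora
peft: parameter-efficient prompt tuning toolkit | https://github.com/huggingface/peft
RL4LMs: AllenAI's RL toolkit | https://github.com/allenai/RL4LMs
trl: Transformer-based reinforcement learning training framework | https://github.com/lvwerra/trl
trlx: distributed-training version of trl | https://github.com/CarperAI/trlx
PKU's open-source Beaver project: reproducible RLHF, supports most LLMs, provides RLHF data | https://github.com/PKU-Alignment/safe-rlhf
LMFlow: open-source large-model fine-tuning framework from an HKUST lab; supports instruction tuning and RLHF for most of the open-source models above | https://github.com/OptimalScale/LMFlow
hugNLP: built on Hugging Face, integrating prompt techniques, pre-training, knowledge injection, and other approaches | https://github.com/wjn1996/HugNLP
Deepspeed: integrated optimizations for RL training and inference | https://github.com/microsoft/DeepSpeed
UER-py: pre-training framework supporting lm, mlm, unilm, and more | https://github.com/dbiir/UER-py
TencentPretrain: refactored version of UER-py with LLaMA pre-training support | https://github.com/Tencent/TencentPretrain/tree/main
langchain: LLM toolkit | https://github.com/hwchase17/langchain
BMTools: from Tsinghua, similar to langchain | https://github.com/OpenBMB/BMTools
BabyAGI: self-executing LLM agent | https://github.com/yoheinakajima/babyagi
AutoGPT: self-executing LLM agent | https://github.com/Torantulino/Auto-GPT
Jarvis: framework in which a large model calls small models, giving small models a future! | https://github.com/search?q=jarvis
lamini: library integrating instruction data generation, SFT, and RLHF | https://github.com/lamini-ai/lamini/
wenda: Wenda integrates small models with search for knowledge incorporation | https://github.com/l15y/wenda
Chain-of-thought-hub: platform for evaluating model reasoning ability | https://github.com/FranxYao/chain-of-thought-hub
FlexGen: CPU-offload compute architecture for LLM inference | https://github.com/FMInference/FlexGen
LLM-ToolMaker: lets LLMs build their own agents | https://github.com/FMInference/FlexGen
Gorilla: LLM that calls a large number of APIs | https://github.com/ShishirPatil/gorilla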
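
The LoRA entry above points to a low-rank fine-tuning approach. As a rough illustration of the idea (not the alpaca-lora or peft implementation), the sketch below keeps a weight matrix frozen and adds a trainable low-rank product on top. The class name `LoRALinear`, the helper `matvec`, and all numbers are hypothetical and chosen only for this example.

```python
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W plus a low-rank update scaled by alpha / rank."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen, shape (d_out, d_in)
        self.scale = alpha / rank
        # A gets small random values and B starts at zero, so the adapted
        # layer initially produces exactly the frozen layer's output.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_parameter_count(self):
        # Only A (rank x d_in) and B (d_out x rank) would be trained.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Illustrative 3x4 weight matrix and input (assumed values).
frozen_weight = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 0.0, -1.0, 2.0],
    [0.0, 1.0, 0.0, -1.0],
]
layer = LoRALinear(frozen_weight, rank=1)
sample_input = [1.0, 0.0, 2.0, -1.0]
initial_output = layer.forward(sample_input)
full_parameter_count = len(frozen_weight) * len(frozen_weight[0])
```

With rank 1, the adapter here has 7 trainable values against 12 in the full matrix; the gap grows quickly for realistic layer sizes, which is the motivation for the approach.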

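Open Data

Data type | Description | Link
Instruction tuning | self-instruct: instruction set generated and filtered automatically with GPT-3 | https://github.com/yizhongw/self-instruct
Instruction tuning | Stanford Alpaca: 52K self-instruct instruction examples generated with text-davinci-003 | https://github.com/tatsu-lab/stanford_alpaca
Instruction tuning | GPT4-for-LLM: Chinese + English + comparison instructions | https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM
Instruction tuning | GPTeacher: more diverse general instructions, role-play, and code instructions | https://github.com/teknium1/GPTeacher/tree/main
Instruction tuning | Chinese translations of Alpaca, plus some other instruction datasets | https://github.com/hikariming/alpaca_chinese_dataset https://github.com/carbonz0/alpaca-chinese-dataset
Instruction tuning | Alpaca instructions answered by GPT-4; noticeably higher quality and longer responses than the versions above | https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/tree/main
Instruction tuning | Guanaco data: Alpaca instructions rewritten and generated in different languages, 534K in total; includes dialogue and non-dialogue types, plus supplementary generated QA samples | https://huggingface.co/datasets/JosephusCheung/GuanacoDataset
Instruction tuning | COIG Chinese instructions: translated alpaca + natural + unnatural, multi-turn dialogue, exams, and leetcode instructions | https://github.com/BAAI-Zlab/COIG
Instruction tuning | Samples used to train Vicuna: user-ChatGPT conversation histories from ShareGPT fetched via API; some were organized on HF by community members | https://github.com/domeccleston/sharegpt https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/tree/main
Instruction tuning | HC3 instruction data in Chinese and English, covering finance, open QA, encyclopedia, DBQA, medicine, and more, with human-written responses | https://huggingface.co/datasets/Hello-SimpleAI/HC3-Chinese/tree/main
Instruction tuning | MOSS open-source SFT data, including dialogues that use plugins | https://huggingface.co/datasets/Hello-SimpleAI/HC3-Chinese/tree/main
Instruction tuning | InstructWild data: expanded via self-instruct from ChatGPT instructions scraped from various sources as seeds; bilingual (Chinese and English) | https://github.com/XueFuzhao/InstructionWild/tree/main/data
Instruction tuning | BELLE 1M instruction examples, generated with ChatGPT following Alpaca; includes math, multi-turn dialogue, role-play dialogue, and more | https://github.com/LianjiaTech/BELLE
Instruction tuning | PromptCLUE multi-task prompt dataset: template-built, covering only standard NLP tasks | https://github.com/CLUEbenchmark/pCLUE
Instruction tuning | Instruction dataset used to fine-tune TK-Instruct; fully human-annotated, 1600+ NLP tasks | https://instructions.apps.allenai.org/
Instruction tuning | Instruction dataset used to fine-tune T0 (P3) | https://huggingface.co/datasets/bigscience/P3
Instruction tuning | 46 multilingual datasets derived from P3 (xmtf) | https://github.com/bigscience-workshop/xmtf
Instruction tuning | Unnatural Instructions: 240k examples generated with GPT-3 and then rephrased | https://github.com/orhonovich/unnatural-instructions
Instruction tuning | Alpaca-CoT: cleans multiple data sources and publishes them on HF in a unified format; the highlight is the manually curated CoT data | https://github.com/PhoebusSi/Alpaca-CoT
Instruction tuning | Human-written instruction data covering 23 common Chinese NLP tasks, oriented toward Chinese writing | https://github.com/yangjianxin1/Firefly
Instruction tuning | Amazon CoT instruction samples, including various QA, bigbench, math, and more | https://github.com/amazon-science/auto-cot
Instruction tuning | CSL: metadata (title, abstract, keywords, discipline, category) for 396,209 papers from core Chinese journals; usable for pre-training or for building NLP instruction tasks | https://github.com/ydli-ai/CSL
Instruction tuning | alpaca code: 20K code instruction examples | https://github.com/sahil280114/codealpaca#data-release
Instruction tuning | GPT4Tools: 71K GPT-4 instruction samples | https://github.com/StevenGrove/GPT4Tools
Instruction tuning | GPT-4 instructions + role-play + code instructions | https://github.com/teknium1/GPTeacher
Math | APE210k: math problems crawled from the web, released by Tencent AI Lab | https://github.com/Chenny0808/ape210k
Math | Math23K: elementary-school word problems open-sourced by Yuanfudao AI Lab | https://github.com/SCNU203/Math23k/tree/main
Math | grade school math: OpenAI's grade-school math problems converted into instruction samples with 2-8 reasoning steps | https://huggingface.co/datasets/qwedsacf/grade-school-math-instructions
Math | Math QA dataset with reasoning steps and multiple-choice options | https://huggingface.co/datasets/math_qa/viewer/default/test?row=2
Math | AMC competition math problems | https://huggingface.co/datasets/competition_math
Math | Pure-math computation problems such as linear algebra | https://huggingface.co/datasets/math_dataset
Code | APPS: problems collected from open-access coding sites such as Codeforces and Kattis | https://opendatalab.org.cn/APPS
Code | Lyra: Python code with embedded SQL; carefully annotated database-manipulation programs with Chinese and English comments | https://opendatalab.org.cn/Lyra
Code | CoNaLa: drawn from StackOverflow questions; 3k manually annotated, English | https://opendatalab.org.cn/CoNaLa/download
Code | code-alpaca: 20K code instruction samples generated with ChatGPT | https://github.com/sahil280114/codealpaca.git
Dialogue instructions | A manually selected subset of components from LAION's curated Open Instruction Generalist dataset; 40M released so far (30K items), 100M on the way | https://github.com/LAION-AI/Open-Instruction-Generalist
Dialogue instructions | Baize: self-chat data built with ChatGPT | https://github.com/project-baize/baize-chatbot/tree/main/data
Dialogue instructions | Dialogue data for training BlenderBot, open-sourced by Facebook, ~6K | https://huggingface.co/datasets/blended_skill_talk
Dialogue instructions | SODA: high-quality dataset of 385K dialogues open-sourced by AllenAI | https://realtoxicityprompts.apps.allenai.org/
Dialogue instructions | InstructDial: instruction tuning on a single task type, dialogue | https://github.com/prakharguptaz/Instructdial
Dialogue instructions | UltraChat: two separate ChatGPT Turbo APIs converse with each other to generate multi-turn dialogue data | https://github.com/thunlp/UltraChat
Dialogue instructions | Awesome Open-domain Dialogue Models: provides multiple open-domain dialogue datasets | https://github.com/cingtiye/Awesome-Open-domain-Dialogue-Models#%E4%B8%AD%E6%96%87%E5%BC%80%E6%94%BE%E5%9F%9F%E5%AF%B9%E8%AF%9D%E6%95%B0%E6%8D%AE%E9%9B%86
RLHF | PKU Beaver's open-source RLHF dataset: 10K available, 1M requires an application | https://huggingface.co/datasets/PKU-Alignment/PKU-SafeRLHF-10K
RLHF | Anthropic hh-rlhf dataset | https://huggingface.co/datasets/Anthropic/hh-rlhf
RLHF | Stack Exchange questions with multiple answers each, every answer scored | https://huggingface.co/datasets/HuggingFaceH4/stack-exchange-preferences/tree/main
RLHF | Facebook Bot Adversarial Dialogues dataset, 5K | https://github.com/facebookresearch/ParlAI
RLHF | AllenAI Real Toxicity Prompts | https://github.com/facebookresearch/ParlAI
RLHF | OpenAssistant Conversations: 160K messages, 13,500 human-generated, mostly English | https://huggingface.co/datasets/OpenAssistant/oasst1
Evaluation | BigBench (Beyond the Imitation Game Benchmark) | https://github.com/google/BIG-bench
Evaluation | Complex QA: instruction set for evaluating ChatGPT | https://github.com/tan92hl/Complex-Question-Answering-Evaluation-of-ChatGPT
Evaluation | Langchain's open-source evaluation datasets | https://huggingface.co/LangChainDatasets
Evaluation | Questions from China's national college entrance exam (Gaokao), 2010-2022 | https://github.com/OpenLMLab/GAOKAO-Bench
Evaluation | SuperCLUE: comprehensive evaluation benchmark for general-purpose Chinese large models | https://github.com/CLUEbenchmark/SuperCLUE
Pre-training | RedPajama: open-source pre-training dataset reproducing LLaMA's | https://github.com/togethercomputer/RedPajama-Data
Pre-training | The Pile: 800G pre-training dataset mixing 22 high-quality datasets; fully available for download | https://pile.eleuther.ai/
Pre-training | CLUECorpusSmall + News Commentary (Chinese-English), compiled by UER | https://github.com/dbiir/UER-py/wiki/%E9%A2%84%E8%AE%AD%E7%BB%83%E6%95%B0%E6%8D%AE
Pre-training | WuDao: 200G of pre-training data open-sourced by BAAI | https://github.com/BAAI-WuDao/WuDaoMM
Multi-source dataset hub | opendatalab aggregates multiple data sources for the pre-training stage | https://opendatalab.org.cn/?industry=9821&source=JUU3JTlGJUE1JUU0JUI5JThF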
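
One RLHF row above describes Stack Exchange questions that each carry several scored answers. A minimal sketch of how records of that shape might be turned into (chosen, rejected) pairs for reward-model training follows. The function name, the dictionary keys, the `min_gap` threshold, and the sample record are all hypothetical; none of them is the schema of any dataset listed here.

```python
from itertools import combinations


def build_preference_pairs(question, scored_answers, min_gap=1):
    """Return one dict per answer pair whose scores differ by at least min_gap.

    scored_answers is a list of (answer_text, score) tuples. The higher-scored
    answer becomes "chosen" and the lower-scored one "rejected".
    """
    pairs = []
    for (text_a, score_a), (text_b, score_b) in combinations(scored_answers, 2):
        if abs(score_a - score_b) < min_gap:
            continue  # near-ties carry little preference signal, so skip them
        if score_a > score_b:
            chosen, rejected = text_a, text_b
        else:
            chosen, rejected = text_b, text_a
        pairs.append({"prompt": question, "chosen": chosen, "rejected": rejected})
    return pairs


# Made-up record for illustration only.
example_answers = [
    ("Use a list comprehension.", 12),
    ("Write a for loop.", 5),
    ("Use map().", 5),
]
example_pairs = build_preference_pairs(
    "How do I square every item in a list?", example_answers
)
```

In this example the two answers tied at 5 produce no pair with each other, so only the two comparisons against the top-scored answer remain.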

Resources

Tools & Tutorial

AIGC playground

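  • cognosys: the hottest web-based AutoGPT around. How to put it: after trying it I nearly laughed my jaw off. No spoilers, try it and you'll see
  • godmode: an AutoGPT that requires human interaction at every step
  • agentgpt: a basic AutoGPT
  • New Bing: requires access to the international internet, otherwise you are redirected to a different Bing site; a waitlist application is needed
  • Perplexity.ai: also requires a VPN; a remarkable search engine that integrates ChatGPT, arguably better than Bing, adding related recommendations and follow-up questions beyond what Bing offers ⭐
  • BingGPT: open-source desktop client for New Bing; can export chat history
  • DocsGPT: a general approach to turning ChatGPT's open-domain QA into closed-domain QA; suited to vertical-domain QA scenarios, and you can try a customized chatbot
  • langchain-ChatGLM: local knowledge-base QA built on ChatGLM; similar to DocsGPT above, but deployable locally ⭐
  • ChatPDF: a Chinese ChatPDF; after you upload a PDF it suggests the top 5 likely questions about the article, then answers questions and retrieves from the document conversationally; reads 30,000 characters in 10s
  • ChatDoc: an upgraded ChatPDF, adding table parsing plus thorough index citations with jump links and highlighting of the corresponding passage. Haha, I'm planning to build one myself
  • ChatPaper: given input keywords, automatically downloads the latest papers from arXiv and summarizes them; can be tried on Hugging Face!
  • OpenRead: for paper writing and reading; helps generate literature reviews and offers a smart Markdown editor similar to NotionAI for writing
  • researchgpt: similar to ChatPDF; supports downloading arXiv papers and, once loaded, retrieving key points conversationally
  • BriefGPT: daily updates of arXiv papers with summaries and keyword extraction to help researchers follow the latest developments; nice UI
  • ChatGPT-academic: yet another gradio-based package bundling paper polishing, summarization, and similar features
  • feishu-chatgpt: ChatGPT for Feishu; like 365 Copilot, it integrates multiple components. Fairly complete!
  • ChatMind: generates mind maps with ChatGPT; works reasonably for topics, but for a specific book it just makes things up. Combining it with retrieval-based reading looks promising~
  • Shell: ChatGPT-based AI English chat tool and speaking-practice assistant
  • AI Topiah: AI character chat from 聆心智能 (Lingxin Intelligence); chatted a bit with Luffy, and it did stir up some chuunibyou spirit
  • chatbase: emotional character chat; not tried yet
  • Vana: virtual DNA; create a virtual self through chat! Flashy concept
  • WriteSonic: AI writing; supports chat and targeted generation such as ad copy and product descriptions; web search support is the highlight; supports Chinese
  • copy.ai: WriteSonic competitor; the highlight is that every sentence has a corresponding website link, like paper citations, and can be copied with one click into the Markdown editor on the right. Very handy!
  • NotionAI: smart Markdown, a pleasure to use! Invoke the AI with commands while writing to polish, expand, look up content, and suggest ideas
  • Jasper: same as above; they're all competitors, haha
  • copy.down: Chinese marketing copy generation; targeted generation only, supports keyword-to-copy generation
  • ChatExcel: controls Excel calculations with instructions; of marginal use to those who know Excel, somewhat useful to those who don't
  • ChatPPT: makes PPT slides with ChatGPT
  • BibiGPT: one-click summaries of Bilibili video content; multimodal documents
  • Microsoft 365 Copilot: Microsoft Office fully integrates GPT-4, with smart PPT, Excel, and Word; no link yet. Essentially a full bundle of the open-source ideas above
  • Google Workspace: Google's full coverage of office scenarios with assorted AI services; no way to try it yet.
  • Copilot: paid
  • Fauxpilot: local open-source alternative to Copilot
  • CodeGeeX: a Chinese alternative; not tried yet
  • Codeium: Copilot alternative; has a free tier and supports various plugins
  • Wolverine: Python script that lets code debug itself
  • dreamstudio.ai: a pioneer; Stable Diffusion, with a trial quota
  • midjourney: a pioneer; mainly artistic styles
  • Dall.E: and that completes the big three
  • ControlNet: adds controllability to image creation
  • GFPGAN: photo restoration
  • Visual ChatGPT: Microsoft's image ChatGPT; conversational image generation, editing, and QA
  • gemo.ai: multimodal chatbot covering text, image, and video generation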

Recommended Blogs

Papers

Paper List

Survey

  • A Survey of Large Language Models
  • Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing ⭐
  • Paradigm Shift in Natural Language Processing
  • Pre-Trained Models: Past, Present and Future
  • What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
  • Towards Reasoning in Large Language Models: A Survey
  • Reasoning with Language Model Prompting: A Survey

LLM Ability Analysis & Probing

  • LARGER LANGUAGE MODELS DO IN-CONTEXT LEARNING DIFFERENTLY
  • Evidence of Meaning in Language Models Trained on Programs
  • Sparks of Artificial General Intelligence: Early experiments with GPT-4
  • How does in-context learning work? A framework for understanding the differences from traditional supervised learning
  • Why can GPT learn in-context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers
  • Emergent Abilities of Large Language Models
  • Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
  • Can Explanations Be Useful for Calibrating Black Box Models
  • IS CHATGPT A GENERAL-PURPOSE NATURAL LANGUAGE PROCESSING TASK SOLVER?

Tuning-Free Prompt

  • GPT2: Language Models are Unsupervised Multitask Learners
  • GPT3: Language Models are Few-Shot Learners ⭐
  • LAMA: Language Models as Knowledge Bases?
  • AutoPrompt: Eliciting Knowledge from Language Models

Fix-Prompt LM Tuning

  • T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  • PET-TC(a): Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference ⭐
  • PET-TC(b): PETSGLUE, It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
  • GenPET: Few-Shot Text Generation with Natural Language Instructions
  • LM-BFF: Making Pre-trained Language Models Better Few-shot Learners ⭐
  • ADAPET: Improving and Simplifying Pattern Exploiting Training

Fix-LM Prompt Tuning

  • Prefix-tuning: Optimizing continuous prompts for generation
  • Prompt-tuning: The power of scale for parameter-efficient prompt tuning ⭐
  • P-tuning: GPT Understands Too ⭐
  • WARP: Word-level Adversarial ReProgramming

LM + Prompt Tuning

  • P-tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
  • PTR: Prompt Tuning with Rules for Text Classification
  • PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains

Fix-LM Adapter Tuning

  • LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS ⭐
  • LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning
  • Parameter-Efficient Transfer Learning for NLP
  • INTRINSIC DIMENSIONALITY EXPLAINS THE EFFECTIVENESS OF LANGUAGE MODEL FINE-TUNING

Instruction Tuning LLMs

  • Flan: FINETUNED LANGUAGE MODELS ARE ZERO-SHOT LEARNERS ⭐
  • Flan-T5: Scaling Instruction-Finetuned Language Models
  • Instruct-GPT: Training language models to follow instructions with human feedback ⭐
  • T0: MULTITASK PROMPTED TRAINING ENABLES ZERO-SHOT TASK GENERALIZATION
  • Natural Instructions: Cross-Task Generalization via Natural Language Crowdsourcing Instructions
  • Tk-INSTRUCT: SUPER-NATURALINSTRUCTIONS: Generalization via Declarative Instructions on 1600+ NLP Tasks
  • Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor

Train for Dialogue

  • LaMDA: Language Models for Dialog Applications
  • Sparrow: Improving alignment of dialogue agents via targeted human judgements ⭐
  • BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage
  • How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation

Chain of Thought

  • [zero-shot-COT] Large Language Models are Zero-Shot Reasoners ⭐
  • [Manual COT] Chain of Thought Prompting Elicits Reasoning in Large Language Models ⭐
  • SELF-CONSISTENCY IMPROVES CHAIN OF THOUGHT REASONING IN LANGUAGE MODELS
  • COMPLEXITY-BASED PROMPTING FOR MULTI-STEP REASONING
  • LEAST-TO-MOST PROMPTING ENABLES COMPLEX REASONING IN LARGE LANGUAGE MODELS
  • Solving Quantitative Reasoning Problems with Language Models
  • Specializing Smaller Language Models towards Multi-Step Reasoning
  • Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters
  • TEXT AND PATTERNS: FOR EFFECTIVE CHAIN OF THOUGHT IT TAKES TWO TO TANGO
  • Decomposed Prompting: A MODULAR APPROACH FOR Solving Complex Tasks
  • Solving math word problems with process- and outcome-based feedback
  • CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
  • Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective
  • OlaGPT: Empowering LLMs With Human-like Problem-Solving Abilities

RLHF

  • Deepmind
    • Teaching language models to support answers with verified quotes
    • sparrow, Improving alignment of dialogue agents via targeted human judgements ⭐
  • openai
    • PPO: Proximal Policy Optimization Algorithms ⭐
    • Deep Reinforcement Learning from Human Preferences
    • Fine-Tuning Language Models from Human Preferences
    • learning to summarize from human feedback
    • InstructGPT: Training language models to follow instructions with human feedback ⭐
    • Scaling Laws for Reward Model Overoptimization ⭐
  • Anthropic
    • A General Language Assistant as a Laboratory for Alignment
    • Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
    • Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback ⭐
    • Constitutional AI: Harmlessness from AI Feedback ⭐
    • Pretraining Language Models with Human Preferences
  • AllenAI, RL4LM: IS REINFORCEMENT LEARNING (NOT) FOR NATURAL LANGUAGE PROCESSING BENCHMARKS
  • RRHF: Rank Responses to Align Language Models with Human Feedback without tears
  • PRM: Let's verify step by step

Agent: Letting Models Use Tools

  • Toolformer: Language Models Can Teach Themselves to Use Tools ⭐
  • MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning
  • ReAct: SYNERGIZING REASONING AND ACTING IN LANGUAGE MODELS ⭐
  • Self-ask: MEASURING AND NARROWING THE COMPOSITIONALITY GAP IN LANGUAGE MODELS ⭐
  • PAL: Program-aided Language Models
  • HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
  • OpenAGI: When LLM Meets Domain Experts
  • Tool Learning with Foundation Models
  • Tool Maker: Large Language Models as Tool Makers
  • Gorilla: Large Language Model Connected with Massive APIs

Instruction Data Generation

  • APE: LARGE LANGUAGE MODELS ARE HUMAN-LEVEL PROMPT ENGINEERS ⭐
  • SELF-INSTRUCT: Aligning Language Model with Self Generated Instructions ⭐
  • iPrompt: Explaining Data Patterns in Natural Language via Interpretable Autoprompting
  • Flipped Learning: Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
  • Fairness-guided Few-shot Prompting for Large Language Models
  • Instruction induction: From few examples to natural language task descriptions.
  • Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
  • SELF-QA: Unsupervised Knowledge Guided Alignment

Domain Models

  • BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
  • Galactica: A Large Language Model for Science
  • PubMed GPT: A Domain-specific large language model for biomedical text ⭐
  • BloombergGPT: A Large Language Model for Finance
  • ChatDoctor: Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge
  • Med-PaLM: Large Language Models Encode Clinical Knowledge [V1, V2] ⭐
  • Augmented Large Language Models with Parametric Knowledge Guiding
  • XuanYuan 2.0: A Large Chinese Financial Chat Model with Hundreds of Billions Parameters

LLM Long-Text Processing

  • Parallel Context Windows for Large Language Models
  • Structured Prompting: Scaling In-Context Learning to 1,000 Examples
  • Su Jianlin (苏剑林), NBCE: Extending the context length LLMs can handle using naive Bayes
  • Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
  • Unlimiformer: Long-Range Transformers with Unlimited Length Input
  • Scaling Transformer to 1M tokens and beyond with RMT
  • RECURRENTGPT: Interactive Generation of (Arbitrarily) Long Text
  • TRAIN SHORT, TEST LONG: ATTENTION WITH LINEAR BIASES ENABLES INPUT LENGTH EXTRAPOLATION ⭐

LLM Tuning Practice/Report

  • BELLE: Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases
  • Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
  • A Comparative Study between Full-Parameter and LoRA-based Fine-Tuning on Chinese Instruction Data for Large LM
  • Exploring ChatGPT’s Ability to Rank Content: A Preliminary Study on Consistency with Human Preferences
  • Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation
  • LIMA: Less Is More for Alignment ⭐

Reliability

  • Trusting Your Evidence: Hallucinate Less with Context-aware Decoding ⭐
  • SELF-REFINE: ITERATIVE REFINEMENT WITH SELF-FEEDBACK ⭐
  • SELF-CONSISTENCY IMPROVES CHAIN OF THOUGHT REASONING IN LANGUAGE MODELS ⭐
  • PROMPTING GPT-3 TO BE RELIABLE
  • Enhancing Self-Consistency and Performance of Pre-Trained Language Models through Natural Language Inference

Other Prompt Engineering

  • Generated Knowledge Prompting for Commonsense Reasoning
  • In-Context Instruction Learning

Multimodal

  • InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
  • Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
  • PaLM-E: An Embodied Multimodal Language Model
