Awesome-Align-LLM-Human

A collection of papers and resources about aligning large language models (LLMs) with humans.

Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks. Despite their notable performance, these models are prone to certain limitations, such as misunderstanding human instructions, generating potentially biased content, or producing factually incorrect (hallucinated) information. Hence, aligning LLMs with human expectations has become an active area of interest within the research community. This survey presents a comprehensive overview of these alignment technologies, covering three aspects: (1) data collection, (2) training methodologies, and (3) model evaluation. In conclusion, we collate and distill our findings, shedding light on several promising future research avenues in the field. This survey therefore serves as a valuable resource for anyone invested in understanding and advancing the alignment of LLMs to better suit human-oriented tasks and expectations.

We hope this repository helps researchers and practitioners gain a better understanding of this emerging field. If you find this repository helpful, please support us by citing this paper:

@article{aligning_llm_human,
    title={Aligning Large Language Models with Human: A Survey},
    author={Yufei Wang and Wanjun Zhong and Liangyou Li and Fei Mi and Xingshan Zeng and Wenyong Huang and Lifeng Shang and Xin Jiang and Qun Liu},
    journal={arXiv preprint arXiv:2307.12966},
    year={2023}
}

News

🔭 This project is under development. Hit STAR and WATCH to follow the updates.

2023/07/31: Our survey paper was featured on [Podcast @ papersread.ai]
2023/07/25: Our initial survey paper, Aligning Large Language Models with Human: A Survey, became available.

Related Surveys

A Survey of Large Language Models [Paper]
A Survey on Multimodal Large Language Models [Paper]
A Survey on Evaluation of Large Language Models [Paper]
Challenges and Applications of Large Language Models [Paper]
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond [Paper]
Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey [Paper]
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation [Paper]
Unifying Large Language Models and Knowledge Graphs: A Roadmap [Paper]
Tool Learning with Foundation Models [Paper]
Eight Things to Know about Large Language Models [Paper]
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback [Paper]
A Stage Review of Instruction Tuning [Blog]

Alignment Data

Data From Human

NLP Benchmarks

PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts [Paper]
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks [Paper]
The FLAN collection: Designing data and methods for effective instruction tuning [Paper]
The OIG Dataset [Blog]
ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human [Paper]
Text Alignment Is An Efficient Unified Model for Massive NLP Tasks [Paper]
OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization [Paper]
Instruct-FinGPT: Financial Sentiment Analysis by Instruction Tuning of General-Purpose Large Language Models [Paper]

Domain Knowledge

Learning A Foundation Language Model for Geoscience Knowledge Understanding and Utilization [Paper]
Lawyer LLaMA Technical Report [Paper]
HuaTuo: Tuning LLaMA Model with Chinese Medical Knowledge [Paper]
PMC-LLaMA: Further Finetuning LLaMA on Medical Papers [Paper]
Parameter-Efficient Fine-Tuning of LLaMA for the Clinical Domain [Paper]

Hand-crafted Instructions

Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM [Blog]
OpenAssistant Conversations -- Democratizing Large Language Model Alignment [Paper]
Chinese open instruction generalist: A preliminary release [Paper]
ShareGPT [Blog]
Let's Verify Step by Step [Paper]
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset [Paper]
The Importance of Human-Labeled Data in the Era of LLMs [Paper]

Human Preference Data

Training language models to follow instructions with human feedback [Paper]
Improving alignment of dialogue agents via targeted human judgements [Paper]
Fine-Tuning Language Models from Human Preferences [Paper]
Teaching language models to support answers with verified quotes [Paper]
WebGPT: Browser-assisted question-answering with human feedback [Paper]

Data From Strong LLMs

General Instructions

Improving Input Quality

Self-Instruct: Aligning Language Models with Self-Generated Instructions [Paper]
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions [Paper]
Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data [Paper]
Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias [Paper]
WizardLM: Empowering Large Language Models to Follow Complex Instructions [Paper]
Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor [Paper]
Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation [Paper]
Exploring Format Consistency for Instruction Tuning [Paper]

Improving Output Quality

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models [Paper]
Orca: Progressive Learning from Complex Explanation Traces of GPT-4 [Paper]
Lion: Adversarial Distillation of Closed-Source Large Language Model [Paper]
Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision [Paper]
ExpertPrompting: Instructing Large Language Models to be Distinguished Experts [Paper]
Phoenix: Democratizing ChatGPT across Languages [Paper]
Improving Cross-Task Generalization with Step-by-Step Instructions [Paper]
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning [Paper]

Reasoning Instructions

General Reasoning

Specializing Smaller Language Models towards Multi-Step Reasoning [Paper]
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes [Paper]
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks [Paper]
PaD: Program-aided Distillation Specializes Large Models in Reasoning [Paper]

Code

Textbooks Are All You Need [Paper]
WizardCoder: Empowering Code Large Language Models with Evol-Instruct [Paper]
Code Alpaca: An Instruction-following LLaMA model for code generation [Github]
CodeT5+: Open Code Large Language Models for Code Understanding and Generation [Paper]
PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback [Paper]

Maths

MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning [Paper]
Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks [Paper]
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models [Paper]

Conversational Instructions

Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality [Blog]
Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data [Paper]
Enhancing Chat Language Models by Scaling High-quality Instructional Conversations [Paper]
CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society [Paper]
Selfee: Iterative self-revising llm empowered by self-feedback generation [Blog]
An Effective Data Creation Pipeline to Generate High-quality Financial Instruction Data for Large Language Model [Paper]

Multilingual Instructions

Phoenix: Democratizing ChatGPT across Languages [Paper]
BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models [Paper]
Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation [Paper]
Instruct-Align: Teaching Novel Languages with to LLMs through Alignment-based Cross-Lingual Instruction [Paper]

Instructions Management

Instruction Implications

How far can camels go? exploring the state of instruction tuning on open resources [Paper]
Flacuna: Unleashing the problem solving power of vicuna using flan fine-tuning [Paper]
Scaling data-constrained language models [Paper]
Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation [Paper]
The False Promise of Imitating Proprietary LLMs [Paper]
Fundamental Limitations of Alignment in Large Language Models [Paper]

Instruction Quantity

Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning [Paper]
LIMA: Less Is More for Alignment [Paper]
Instruction Mining: High-Quality Instruction Data Selection for Large Language Models [Paper]
AlpaGasus: Training A Better Alpaca with Fewer Data [Paper]
Maybe Only 0.5% Data is Needed: A Preliminary Exploration of Low Training Data Instruction Tuning [Paper]

Alignment Training

Online Human Alignment

Training language models to follow instructions with human feedback [Paper]
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment [Paper]
Constitutional AI: Harmlessness from AI Feedback [Paper]
RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment [Paper]

Offline Human Alignment

Rank-based Training

Direct Preference Optimization: Your Language Model is Secretly a Reward Model [Paper]
Preference Ranking Optimization for Human Alignment [Paper]
RRHF: Rank Responses to Align Language Models with Human Feedback without tears [Paper]
PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback [Paper]
Calibrating Sequence likelihood Improves Conditional Language Generation [Paper]
Making Large Language Models Better Reasoners with Alignment [Paper]

Language-based Training

OpenChat: Less is More for Open-source Models [Github]
Languages are rewards: Hindsight finetuning using human feedback [Paper]
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits [Paper]
Training Socially Aligned Language Models in Simulated Human Society [Paper]
Selfee: Iterative self-revising llm empowered by self-feedback generation [Blog]
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training [Paper]

Parameter-Efficient Training

LoRA: Low-Rank Adaptation of Large Language Models [Paper]
QLoRA: Efficient Finetuning of Quantized LLMs [Paper]
Prefix-Tuning: Optimizing Continuous Prompts for Generation [Paper]
The Power of Scale for Parameter-Efficient Prompt Tuning [Paper]
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning [Paper]
Parameter-Efficient Fine-Tuning Design Spaces [Paper]
HINT: Hypernetwork Instruction Tuning for Efficient Zero- & Few-Shot Generalisation [Paper]

Model Architecture Design

Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models [Paper]
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions [Paper]

Alignment Evaluation

Evaluation Design Principles

Sparks of Artificial General Intelligence: Early experiments with GPT-4 [Paper]
Efficiently Measuring the Cognitive Ability of LLMs: An Adaptive Testing Perspective [Paper]
Holistic Evaluation of Language Models [Paper]

Evaluation Benchmarks

Closed-set Benchmarks

General Knowledge

Measuring Massive Multitask Language Understanding [Paper]
CMMLU: Measuring massive multitask language understanding in Chinese [Paper]
C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models [Paper]
KoLA: Carefully Benchmarking World Knowledge of Large Language Models [Paper]
M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models [Paper]
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models [Paper]
Measuring Massive Multitask Chinese Understanding [Paper]
Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation [Paper]
TABLET: Learning From Instructions For Tabular Data [Paper]
Can Language Models Understand Physical Concepts? [Paper]

Reasoning

Training Verifiers to Solve Math Word Problems [Paper]
Measuring Massive Multitask Language Understanding [Paper]
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge [Paper]
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies [Paper]
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models [Paper]
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them [Paper]
Program Synthesis with Large Language Models [Paper]
DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation [Paper]
Evaluating Large Language Models Trained on Code [Paper]
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation [Paper]
RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems [Paper]
ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation [Paper]
StudentEval: A Benchmark of Student-Written Prompts for Large Language Models of Code [Paper]

Open-set Benchmarks

General Chat

Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality [Blog]
Self-Instruct: Aligning Language Models with Self-Generated Instructions [Paper]
OpenAssistant Conversations -- Democratizing Large Language Model Alignment [Paper]
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets [Paper]
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena [Paper]
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback [Paper]

Safety

Safety Assessment of Chinese Large Language Models [Paper]
CValues: Measuring the Values of Chinese Large Language Models from Safety to Responsibility [Paper]
Latent Jailbreak: A Benchmark for Evaluating Text Safety and Output Robustness of Large Language Models [Paper]
TrustGPT: A Benchmark for Trustworthy and Responsible Large Language Models [Paper]

Long Context

L-Eval: Instituting Standardized Evaluation for Long Context Language Models [Paper]

Evaluation Paradigms

Human-based Evaluation

Self-Instruct: Aligning Language Models with Self-Generated Instructions [Paper]
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions [Paper]
Training language models to follow instructions with human feedback [Paper]
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena [Paper]

LLM-based Evaluation

LLMs for Evaluation

G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment [Paper]
GPTScore: Evaluate as You Desire [Paper]
Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: A Preliminary Empirical Study [Paper]
Can Large Language Models Be an Alternative to Human Evaluations? [Paper]
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation [Paper]
AlignScore: Evaluating Factual Consistency with A Unified Alignment Function [Paper]
Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models: A Case Study on ChatGPT [Paper]
Human-like Summarization Evaluation with ChatGPT [Paper]
Large Language Models Are State-of-the-Art Evaluators of Code Generation [Paper]
Benchmarking Foundation Models with Language-Model-as-an-Examiner [Paper]
LLM-Eval: Unified Multi-Dimensional Automatic Evaluation for Open-Domain Conversations with Large Language Models [Paper]
LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond [Paper]

LLM Bias in Evaluation

Large Language Models are not Fair Evaluators [Paper]
Style Over Substance: Evaluation Biases for Large Language Models [Paper]
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena [Paper]

Evaluation-specific LLMs

PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization [Paper]
Wider and Deeper LLM Networks are Fairer LLM Evaluators [Paper]
Shepherd: A Critic for Language Model Generation [Paper]

Alignment Toolkits

Llama V1 & V2 [Github] [Paper V1] [Paper V2]
Llama-X: Open Academic Research on Improving LLaMA to SOTA LLM [Github]
Llama2-Chinese [Github]
Colossal-AI: Making large AI models cheaper, faster, and more accessible. [Github]
Alpa: Training and serving large-scale neural networks with auto parallelization. [Github]
FastChat [Github]
LMFlow [Github]
LLaMA2-Accessory: An Open-source Toolkit for LLM Development [Github]