A curated list of open-source human preference datasets for LLM instruction-tuning, RLHF and evaluation.
For general NLP datasets and text corpora, check out this awesome list.
- 20k comparisons where each example comprises a question, a pair of model answers, and human-rated preference scores for each answer.
- RLHF dataset used to train the OpenAI WebGPT reward model.
- 64k text summarization examples including human-written responses and human-rated model responses.
- RLHF dataset used in the OpenAI Learning to Summarize from Human Feedback paper.
- Explore sample data here.
Anthropic Helpfulness and Harmlessness Dataset (HH-RLHF)
- In total 170k human preference comparisons, including human preference data collected for Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback and human-generated red teaming data from Red Teaming Language Models to Reduce Harms, divided into 3 sub-datasets:
- A base dataset using a context-distilled 52B model, with 44k helpfulness comparisons and 42k red-teaming (harmlessness) comparisons.
- A rejection sampling (RS) dataset of 52k helpfulness comparisons and 2k red-teaming comparisons collected from rejection sampling models, where the rejection sampling used a preference model trained on the base dataset.
- An iterated online dataset including data from RLHF models, updated weekly over five weeks, with 22k helpfulness comparisons.
OpenAssistant Conversations Dataset (OASST1)
- A human-generated, human-annotated assistant-style conversation corpus consisting of 161k messages in 35 languages, annotated with 461k quality ratings, resulting in 10k+ fully annotated conversation trees.
Stanford Human Preferences Dataset (SHP)
- 385K collective human preferences over responses to questions/instructions in 18 domains for training RLHF reward models and NLG evaluation models. Data collected from Reddit.
- 270k examples of questions, answers and scores collected from 3 Q&A subreddits.
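Score-based datasets like these can be turned into pairwise comparisons for reward-model training. The following is a minimal sketch of one way to do that, assuming each answer comes with a numeric score; the function name and the `prompt`/`chosen`/`rejected` field names are illustrative assumptions, and real datasets may apply additional filtering that is not shown here.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def scored_answers_to_pairs(
    question: str, answers: List[Tuple[str, int]]
) -> List[Dict[str, str]]:
    """Turn (answer_text, score) tuples into chosen/rejected comparison records."""
    pairs = []
    for (text_a, score_a), (text_b, score_b) in combinations(answers, 2):
        if score_a == score_b:
            continue  # a tie carries no preference signal
        if score_a > score_b:
            chosen, rejected = text_a, text_b
        else:
            chosen, rejected = text_b, text_a
        pairs.append({"prompt": question, "chosen": chosen, "rejected": rejected})
    return pairs


if __name__ == "__main__":
    example = scored_answers_to_pairs(
        "Why is the sky blue?",
        [("Rayleigh scattering.", 120), ("Because of the ocean.", 3)],
    )
    print(example)
```

Every pair of answers with different scores yields one comparison, so a question with many answers can produce many records; capping the number of pairs per question is a common design choice to avoid over-representing popular threads.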
Human ChatGPT Comparison Corpus (HC3)
- 60k human answers and 27K ChatGPT answers for around 24K questions.
- Sibling dataset available for Chinese.
HuggingFace H4 StackExchange Preference Dataset
- 10 million questions (with >= 2 answers) and answers (scored based on vote count) from Stack Overflow.
- 90k (as of April 2023) user-uploaded ChatGPT interactions.
- To access the data using ShareGPT's API, see documentation here. The ShareGPT API is currently disabled ("due to excess traffic").
- Precompiled datasets on HuggingFace.
- 52k instructions and demonstrations generated by OpenAI's text-davinci-003 engine for self-instruct training.
- 1M prompt-response pairs collected using the GPT-3.5-Turbo API in March 2023. GitHub repo.
- 15k instruction-following records generated by Databricks employees in categories including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization.
- 42k harmlessness examples with the same prompts and "rejected" responses as the Harmless dataset in the Anthropic HH datasets, but with the "chosen" responses rewritten using GPT-4 to yield more harmless answers. A comparison of the responses before and after rewriting can be found here. Empirically, compared with the original Harmless dataset, training on this dataset improves harmlessness metrics for various alignment methods such as RLHF and DPO.