Quality Diversity through Human Feedback

Project Page | Paper | Demo (new) | Talk | Tutorial | Cite

Official Python implementation of Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization (ICML 2024 & Spotlight at NeurIPS 2023 ALOE Workshop)

Li Ding, Jenny Zhang, Jeff Clune, Lee Spector, Joel Lehman

TL;DR: QDHF enhances QD algorithms by inferring diversity metrics from human judgments of similarity, surpassing state-of-the-art methods in automatic diversity discovery in robotics & RL tasks and significantly improving performance in open-ended generative tasks.
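
To make the idea of inferring a diversity metric from similarity judgments concrete, here is a minimal sketch. It assumes each judgment is a triplet (anchor, positive, negative), meaning a person found the anchor more similar to the positive than to the negative, and it fits a one-dimensional linear projection with a triplet margin loss. The function names, the linear model, the hyperparameters, and the simulated judgments are all illustrative assumptions, not the code in this repository.

```python
import random

def embed(w, x):
    """Project feature vector x onto a 1-D learned diversity axis."""
    return sum(wi * xi for wi, xi in zip(w, x))

def triplet_loss(w, triplets, margin=1.0):
    """Mean hinge loss; a term is zero once the anchor is closer to its positive by `margin`."""
    total = 0.0
    for a, p, n in triplets:
        d_pos = (embed(w, a) - embed(w, p)) ** 2
        d_neg = (embed(w, a) - embed(w, n)) ** 2
        total += max(0.0, margin + d_pos - d_neg)
    return total / len(triplets)

def fit_projection(triplets, dim, lr=0.5, steps=300, margin=1.0):
    """Fit w by plain gradient descent on the triplet loss (assumed training loop)."""
    w = [0.1] * dim  # small deterministic starting point
    for _ in range(steps):
        grad = [0.0] * dim
        for a, p, n in triplets:
            e_pos = embed(w, a) - embed(w, p)
            e_neg = embed(w, a) - embed(w, n)
            # Only triplets that still violate the margin contribute a gradient.
            if margin + e_pos ** 2 - e_neg ** 2 > 0:
                for i in range(dim):
                    grad[i] += 2 * e_pos * (a[i] - p[i]) - 2 * e_neg * (a[i] - n[i])
        w = [wi - lr * gi / len(triplets) for wi, gi in zip(w, grad)]
    return w

def simulated_judgments(num, seed=0):
    """Toy stand-in for human feedback: similarity depends only on feature 0."""
    rng = random.Random(seed)
    triplets = []
    for _ in range(num):
        a, b, c = ([rng.random(), rng.random()] for _ in range(3))
        if abs(a[0] - b[0]) <= abs(a[0] - c[0]):
            triplets.append((a, b, c))  # b judged more similar to a
        else:
            triplets.append((a, c, b))  # c judged more similar to a
    return triplets

triplets = simulated_judgments(200)
initial_loss = triplet_loss([0.1, 0.1], triplets)
w = fit_projection(triplets, dim=2)
final_loss = triplet_loss(w, triplets)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

In a QD loop, the value returned by `embed` would play the role of the diversity descriptor that decides which archive cell a solution falls into. A real setup would likely replace the linear map with a learned projection over richer features.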

QDHF (right) improves the diversity in text-to-image generation results compared to best-of-N (left) using Stable Diffusion.

Updates

2024-06-24: Release of the QDHF Gradio Demo on Hugging Face.
2024-03-14: Release of the QDHF tutorial in pyribs.
2023-12-13: Initial release of the codebase.

Demo (new)

We have released a Gradio Demo on Hugging Face. Its interface lets you explore QDHF without writing any code. Special thanks to Jenny Zhang for her contributions!

Tutorial

We have released a tutorial: Incorporating Human Feedback into Quality Diversity for Diversified Text-to-Image Generation, together with the pyribs team. This tutorial features a lightweight version of QDHF and runs on Google Colab in ~1 hour. Dive into the tutorial to explore how QDHF enhances GenAI models with diversified, high-quality responses and apply these insights to your projects!

Requirements

To install the requirements, run:

pip install -r requirements.txt

Usage

Each experiment has its own main.py script. For example, to run the robotic arm experiment:

cd arm
python3 main.py

Replace arm with the name of the experiment you want to run.
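
If you prefer to launch experiments from Python, the same two steps can be wrapped in a small helper. This is a sketch under the assumption that each experiment sits in its own directory containing main.py, as the commands above suggest; `build_command` and `run_experiment` are hypothetical names and are not part of this repository.

```python
import subprocess
import sys
from pathlib import Path

def build_command(experiment, repo_root="."):
    """Return the command and working directory for one experiment."""
    exp_dir = Path(repo_root) / experiment
    # Use the current interpreter so the active virtual environment is respected.
    return [sys.executable, "main.py"], exp_dir

def run_experiment(experiment, repo_root="."):
    """Run <repo_root>/<experiment>/main.py, mirroring `cd <experiment> && python3 main.py`."""
    cmd, exp_dir = build_command(experiment, repo_root)
    if not (exp_dir / "main.py").is_file():
        raise FileNotFoundError(f"no main.py found in {exp_dir}")
    # check=True raises CalledProcessError if the experiment exits with an error.
    return subprocess.run(cmd, cwd=exp_dir, check=True)
```

Calling `run_experiment("arm")` from the repository root is then equivalent to the shell commands above.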

Citation

If you find our work or any of our materials useful, please cite our paper:

@inproceedings{ding2024quality,
  title={Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization},
  author={Li Ding and Jenny Zhang and Jeff Clune and Lee Spector and Joel Lehman},
  booktitle={Forty-first International Conference on Machine Learning},
  year={2024},
  url={https://openreview.net/forum?id=9zlZuAAb08}
}

License

This project is released under the MIT License.

Acknowledgments

The main structure of this code is adapted from DQD. Each experiment contains its own modified version of pyribs, a quality diversity optimization library. The maze navigation experiment uses a modified version of Kheperax. The LSI experiment uses Stable Diffusion (huggingface/diffusers), OpenAI CLIP, and DreamSim. Funding acknowledgments are listed in the paper.

Project page: https://liding.info/qdhf