YangLing0818 / RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Home Page: https://arxiv.org/abs/2401.11708

Suggestion - using OpenDalle/DPO model as default

C00reNUT opened this issue

Hello,
thank you for your work and for making it public. It's a very nice idea.

I just wanted to suggest using OpenDalle https://civitai.com/models/238116?modelVersionId=275681 or another DPO fine-tuned model https://civitai.com/models/239836?modelVersionId=270839

In my experiments, OpenDalle is better: it follows prompts better than regular SDXL models.

Thanks for your constructive suggestions. We will consider trying these models in the next update.