openai / glide-text2im

GLIDE: a diffusion-based text-conditional image synthesis model


How to get the results closer to what is shown in the paper?

kaloyan-chernev opened this issue · comments

Really inspirational work guys!

But the results from the published code and models are not even remotely comparable to the results shown in the paper. Is there anything we can do to get closer to the original work?

  • E.g. could we train on a different (maybe bigger and more diverse) dataset?
  • Or do we need a bigger model?
  • Or maybe tweaking the sampling params a bit could help (see the sketch right after this list)?
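For anyone who wants to experiment with the last point, here is a minimal sketch of the sampling loop from the repo's text2im notebook with two knobs changed: `guidance_scale` and `timestep_respacing`. The specific values (5.0 and '200') are illustrative assumptions, not settings recommended by the authors; they just show where to tweak.

```python
# Sketch based on the repo's text2im notebook; guidance_scale=5.0 and
# timestep_respacing='200' are illustrative assumptions, not recommended settings.
import torch as th

from glide_text2im.download import load_checkpoint
from glide_text2im.model_creation import (
    create_model_and_diffusion,
    model_and_diffusion_defaults,
)

device = th.device('cuda' if th.cuda.is_available() else 'cpu')

options = model_and_diffusion_defaults()
options['use_fp16'] = th.cuda.is_available()
options['timestep_respacing'] = '200'  # more sampling steps than the notebook default of '100'
model, diffusion = create_model_and_diffusion(**options)
model.eval()
if options['use_fp16']:
    model.convert_to_fp16()
model.to(device)
model.load_state_dict(load_checkpoint('base', device))

prompt = "a surrealist dream-like oil painting by salvador dalí of a cat playing checkers"
batch_size = 1
guidance_scale = 5.0  # the notebook uses 3.0; higher values trade diversity for prompt adherence

# Classifier-free guidance: run conditional and unconditional batches together.
tokens = model.tokenizer.encode(prompt)
tokens, mask = model.tokenizer.padded_tokens_and_mask(tokens, options['text_ctx'])
uncond_tokens, uncond_mask = model.tokenizer.padded_tokens_and_mask([], options['text_ctx'])
model_kwargs = dict(
    tokens=th.tensor([tokens] * batch_size + [uncond_tokens] * batch_size, device=device),
    mask=th.tensor([mask] * batch_size + [uncond_mask] * batch_size, dtype=th.bool, device=device),
)

def model_fn(x_t, ts, **kwargs):
    # Mix conditional and unconditional eps predictions according to the guidance scale.
    half = x_t[: len(x_t) // 2]
    combined = th.cat([half, half], dim=0)
    model_out = model(combined, ts, **kwargs)
    eps, rest = model_out[:, :3], model_out[:, 3:]
    cond_eps, uncond_eps = th.split(eps, len(eps) // 2, dim=0)
    half_eps = uncond_eps + guidance_scale * (cond_eps - uncond_eps)
    eps = th.cat([half_eps, half_eps], dim=0)
    return th.cat([eps, rest], dim=1)

samples = diffusion.p_sample_loop(
    model_fn,
    (batch_size * 2, 3, options['image_size'], options['image_size']),
    device=device,
    clip_denoised=True,
    progress=True,
    model_kwargs=model_kwargs,
)[:batch_size]
```

Raising the guidance scale tends to improve prompt adherence at the cost of diversity, and more respaced timesteps trade compute for sample quality, but neither will close the gap with the (much larger) model used for the paper's figures.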

Image from the paper for: "a surrealist dream-like oil painting by salvador dalí of a cat playing checkers"

[image]

Image from the released code for the same text prompt "a surrealist dream-like oil painting by salvador..."

[image]

It's almost like that meme: "You vs. the guy she told you not to worry about" 🤣

Anyway, if you can give us some advice on this matter it would be greatly appreciated! 👍

We have not released the full GLIDE model--only GLIDE (filtered) which is 10x smaller than the original model and trained on a much more restricted dataset. We hope this model is still useful for future research, but it won't be able to reproduce the best images in the paper because of these limitations.
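To make the size gap concrete, here is a small sketch (assuming the glide-text2im package is installed) that builds the released base model with its default hyperparameters and prints its parameter count; expect a few hundred million parameters, well below the multi-billion-parameter model described in the paper.

```python
# Minimal sketch: count the parameters of the released GLIDE (filtered) base model.
from glide_text2im.model_creation import (
    create_model_and_diffusion,
    model_and_diffusion_defaults,
)

options = model_and_diffusion_defaults()
model, _ = create_model_and_diffusion(**options)
print(f"base model parameters: {sum(p.numel() for p in model.parameters()):,}")
```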