iuliaturc / detextify

Remove text from AI-generated images

The generated effect will change the original image

centerqi opened this issue

Thanks for flagging this!

This happens with the Stable Diffusion in-painters when the input image is larger than 512 x 512. We split the image into multiple tiles, and call in-painting on the tiles with text (in your case, top-left tile). Unfortunately, SD changes the colors a little bit (despite the mask), hence the effect that you're seeing.
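To make the tiling step concrete, here is a minimal sketch of how an image could be split into 512 x 512 tiles and how the tiles that contain text could be picked out. It is not the project's actual code: the `(x, y, width, height)` box layout and the names `tile_grid`, `overlaps`, and `tiles_with_text` are assumptions made for illustration, and the real implementation may tile differently (for example, with overlap).

```python
from typing import List, Tuple

# Hypothetical box layout: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]

TILE = 512  # tile edge, matching the 512 x 512 limit mentioned above


def tile_grid(width: int, height: int, tile: int = TILE) -> List[Box]:
    """Cover the image with non-overlapping tiles; edge tiles may be smaller."""
    return [
        (x, y, min(tile, width - x), min(tile, height - y))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


def overlaps(a: Box, b: Box) -> bool:
    """True if the two axis-aligned boxes share at least one pixel."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def tiles_with_text(
    width: int, height: int, text_boxes: List[Box], tile: int = TILE
) -> List[Box]:
    """Return only the tiles that intersect at least one detected text box."""
    return [
        t for t in tile_grid(width, height, tile)
        if any(overlaps(t, b) for b in text_boxes)
    ]
```

With a 1024 x 768 image and a single text box near the top-left corner, only the top-left tile would be sent to the in-painter, which is why the color shift shows up as a visible rectangle in that region.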

We should think of work-arounds though, so I'll leave this issue open as a feature request.

In the meantime, if you want to avoid this effect, you can use the DalleInpainter, which accepts images larger than 512 x 512, so we don't need to perform tiling.

Thanks for your reply. I have an idea that might be more efficient:

  1. Run OCR directly to find the position of the text.
  2. Carry out color segmentation of the text blocks.
  3. Generate a mask.
  4. Inpaint directly.
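Steps 2 and 3 of this idea could look roughly like the sketch below. It assumes the OCR step has already produced `(x, y, width, height)` boxes and that the image is available as rows of grayscale values. The segmentation rule (treat the most common intensity in each box as background and mask anything far from it) is a stand-in chosen for illustration, not something proposed in the thread, and `text_mask` and `threshold` are hypothetical names.

```python
from collections import Counter
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), assumed OCR output
Gray = List[List[int]]           # grayscale image as rows of 0..255 values


def text_mask(image: Gray, boxes: List[Box], threshold: int = 60) -> Gray:
    """Build a 0/255 mask that marks likely text pixels inside each box.

    Assumption: within a box, the most frequent intensity is background,
    and pixels differing from it by more than `threshold` are text.
    """
    height, width = len(image), len(image[0])
    mask = [[0] * width for _ in range(height)]
    for x, y, w, h in boxes:
        # Clamp the box to the image bounds.
        rows = range(max(y, 0), min(y + h, height))
        cols = range(max(x, 0), min(x + w, width))
        values = [image[r][c] for r in rows for c in cols]
        if not values:
            continue
        background = Counter(values).most_common(1)[0][0]
        for r in rows:
            for c in cols:
                if abs(image[r][c] - background) > threshold:
                    mask[r][c] = 255
    return mask
```

A mask built this way covers only the glyph pixels rather than whole boxes or tiles, so the in-painter would have less area in which to shift colors. Whether that is enough to hide the edges would need to be checked on real images.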

This might also be caused by the JPEG conversion you're doing. See issue #19.

@centerqi Submitted a fix to only replace the in-painted text boxes, instead of the entire tiles. This doesn't fix the underlying problem (edges are still visible), but in practice it's less jarring for a lot of images.
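The idea behind the fix can be sketched as follows: start from the untouched original and copy over only the pixels inside the text boxes from the in-painted result. This is an illustrative sketch, not the code that was submitted. Images are modeled as plain nested lists, and `paste_text_boxes` and the `(x, y, width, height)` box layout are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), hypothetical layout
Pixels = List[List[int]]         # image as rows of pixel values


def paste_text_boxes(
    original: Pixels, inpainted: Pixels, boxes: List[Box]
) -> Pixels:
    """Return a copy of `original` with only the box regions replaced.

    Both images are assumed to have the same dimensions.
    """
    height, width = len(original), len(original[0])
    result = [row[:] for row in original]  # leave the input untouched
    for x, y, w, h in boxes:
        # Clamp the box to the image bounds.
        x0, x1 = max(x, 0), min(x + w, width)
        for r in range(max(y, 0), min(y + h, height)):
            result[r][x0:x1] = inpainted[r][x0:x1]
    return result
```

Everything outside the boxes keeps its original colors, so any remaining seam is confined to the box borders instead of running along tile boundaries.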

@scruffynerf The conversion only happens for the ReplicateSDInpainter (not the local one), so it doesn't fully justify the discrepancy. Thanks for looking into it though!

Before

(edges visible across the entire image):
octopus_100steps

After

(edge still visible top-center, but less jarring):
octopus_detext