The generated effect will change the original image

Question

centerqi opened this issue 2 years ago

Julia Turc · Answer 1 · Sat Dec 17 2022 01:31:16 GMT+0800 (China Standard Time)

Thanks for flagging this!

This happens with the Stable Diffusion in-painters when the input image is larger than 512 x 512. We split the image into multiple tiles and run in-painting only on the tiles that contain text (in your case, the top-left tile). Unfortunately, SD changes the colors a little bit across the whole tile (despite the mask), hence the effect that you're seeing.
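To make the tiling step concrete, here is a minimal sketch of how an image could be cut into 512 x 512 tiles and how the tiles containing text could be picked out. The function names, the box format, and the non-overlapping grid are assumptions for illustration; the project's actual tiling code may differ (for example, it may use overlap or padding).

```python
from typing import List, Tuple

# (left, top, right, bottom), with right/bottom exclusive. Hypothetical format.
Box = Tuple[int, int, int, int]

TILE = 512  # tile size mentioned in the comment above


def tile_boxes(width: int, height: int, tile: int = TILE) -> List[Box]:
    """Cover a width x height image with tiles of at most tile x tile pixels."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


def intersects(a: Box, b: Box) -> bool:
    """True when the two boxes share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def tiles_with_text(width: int, height: int, text_boxes: List[Box],
                    tile: int = TILE) -> List[Box]:
    """Return only the tiles that overlap at least one detected text box."""
    return [
        t for t in tile_boxes(width, height, tile)
        if any(intersects(t, b) for b in text_boxes)
    ]
```

With a 1024 x 768 image and a single text box near the top-left corner, only the first tile is selected, which matches the case described above: the whole top-left tile is re-generated, so its colors can drift relative to the untouched tiles.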

We should think of work-arounds though, so I'll leave this issue open as a feature request.

In the meantime, if you want to avoid this effect, you can use the DalleInpainter, which accepts images larger than 512 x 512, so no tiling is needed.

centerqi · Answer 2 · Mon Dec 19 2022 12:01:54 GMT+0800 (China Standard Time)

Thanks for your reply. I have an idea that might be more efficient:

1. Run OCR directly to find the position of the text.
2. Carry out color segmentation of the text blocks.
3. Generate a mask.
4. In-paint directly.
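The mask-generation part of the steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the text boxes are taken as given (they would come from whatever OCR engine is used), the image is a plain nested list of RGB tuples, and the "color segmentation" is a simple heuristic that treats the most frequent color inside each box as background. The name `text_mask` and the `threshold` parameter are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Pixel = Tuple[int, int, int]
# (left, top, right, bottom), with right/bottom exclusive. Hypothetical format.
Box = Tuple[int, int, int, int]


def text_mask(image: List[List[Pixel]], boxes: List[Box],
              threshold: int = 60) -> List[List[int]]:
    """Return a 0/1 mask with 1 on pixels that look like text inside the boxes."""
    height, width = len(image), len(image[0])
    mask = [[0] * width for _ in range(height)]
    for left, top, right, bottom in boxes:
        region = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
        if not region:
            continue  # skip degenerate boxes
        # Assumption: the most frequent color in the box is the background.
        background = Counter(region).most_common(1)[0][0]
        for y in range(top, bottom):
            for x in range(left, right):
                # Sum of per-channel absolute differences from the background.
                distance = sum(abs(c - b) for c, b in zip(image[y][x], background))
                if distance > threshold:
                    mask[y][x] = 1
    return mask
```

A mask built this way covers only the glyph pixels rather than the whole box, so the in-painter would be asked to change as little of the image as possible. Real images with gradients or anti-aliased text would likely need a more robust segmentation than this exact-color heuristic.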

scruffynerf · Answer 3 · Mon Jan 02 2023 10:45:01 GMT+0800 (China Standard Time)

This might also be caused by the JPEG conversion you're doing. See issue #19.

Julia Turc · Answer 4 · Thu Jan 05 2023 02:56:38 GMT+0800 (China Standard Time)

@centerqi Submitted a fix to only replace the in-painted text boxes, instead of the entire tiles. This doesn't fix the underlying problem (edges are still visible), but in practice it's less jarring for a lot of images.
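As a rough illustration of the idea behind this fix (not the actual patch), compositing only the text boxes back onto the original could look like the sketch below. It assumes the original and in-painted images have the same size and are plain nested lists of pixels; `paste_boxes` and the box format are hypothetical names chosen for the example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
# (left, top, right, bottom), with right/bottom exclusive. Hypothetical format.
Box = Tuple[int, int, int, int]


def paste_boxes(original: List[List[Pixel]], inpainted: List[List[Pixel]],
                boxes: List[Box]) -> List[List[Pixel]]:
    """Copy only the pixels inside the boxes from inpainted onto a copy of original."""
    result = [row[:] for row in original]  # leave the input image untouched
    for left, top, right, bottom in boxes:
        for y in range(top, bottom):
            result[y][left:right] = inpainted[y][left:right]
    return result
```

Everything outside the boxes keeps its original colors, so any color drift introduced by the in-painter is confined to the box interiors. A seam can still appear along each box border, which is consistent with the edges remaining visible.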

@scruffynerf The conversion only happens for the ReplicateSDInpainter (not the local one), so it doesn't fully justify the discrepancy. Thanks for looking into it though!

Before (edges visible across the entire image)

After (edge still visible top-center, but less jarring)