benkyoujouzu / stable-diffusion-webui-visualize-cross-attention-extension

Potential UI improvement/Batch keyword processing?

some9000 opened this issue · comments

Hi! Once again, I wanted to thank you for this extension. It does provide some insight into what the parts of a prompt do, although the results are quite interesting at times. Either way, it is unfortunate that it did not get more attention in discussions.

Anyway, I thought I could try a small UI change that rearranges things to save some space: https://github.com/some9000/stable-diffusion-webui-visualize-cross-attention-extension-UI-idea

But that was done while attempting to make it process multiple lines of input and then produce an image grid, which turned out to be waaay beyond my programming skills. So if it were possible to add such an option, it would be great. Basically, a multi-line entry of separate prompt elements, each of which is then rendered into a labeled list of images, as gradio can do.

Thank you for your advice. I have changed the arrangement of the UI.

As for batch processing, the problem is that the Stable Diffusion implementation in webui pads every encoded prompt to 77 tokens. If we simply visualize all 77 tokens (which is what happens when the indices box is left blank), the result is usually poor, because most of those positions are padding rather than real prompt tokens. Batch processing would therefore need some hack to obtain the actual length of each encoded input, and I don't think it is a good idea to do that in an extension.
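To illustrate what "obtaining the actual length" means, here is a minimal standalone sketch. It is not the webui or extension code. It assumes the usual CLIP convention of a start token (id 49406) followed by the prompt tokens, an end token (id 49407), and padding with the end token up to 77 positions. The helper names `pad_to_context` and `real_token_indices` and the example token ids are hypothetical.

```python
from typing import List

# Assumed CLIP special-token ids and context length (not taken from this thread).
BOS_ID = 49406
EOS_ID = 49407
CONTEXT_LEN = 77


def pad_to_context(token_ids: List[int]) -> List[int]:
    """Wrap raw token ids with BOS/EOS and pad with EOS up to CONTEXT_LEN."""
    body = token_ids[: CONTEXT_LEN - 2]  # leave room for BOS and EOS
    padded = [BOS_ID] + body + [EOS_ID]
    return padded + [EOS_ID] * (CONTEXT_LEN - len(padded))


def real_token_indices(padded_ids: List[int]) -> List[int]:
    """Return the positions of actual prompt tokens, skipping BOS, EOS and padding."""
    first_eos = padded_ids.index(EOS_ID)  # the first EOS marks the end of the prompt
    return list(range(1, first_eos))


# Placeholder token ids standing in for a three-token prompt.
example = pad_to_context([320, 1125, 539])
print(len(example), real_token_indices(example))
```

With indices like these, only the real prompt tokens would be visualized instead of all 77 positions. Getting hold of the padded ids from inside webui is the part that would need the hack.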

Hope there will be an API for this in the future.