77 token limit

Question

imba-pericia opened this issue 6 months ago

Is it possible to use compel? I tried to add it, but it complains about the tokenizer. I'm not very good at coding: should it work, or shouldn't I try?

In general, imho, SegMoE gives the best results today. I'm delighted :)

No hiresfix, no upscale.

Yatharth Gupta · Answer 1 · Wed Feb 21 2024 03:22:25 GMT+0800 (China Standard Time)

The outputs you shared look amazing! It's absolutely possible to use compel just as you would with diffusers. Here is an example.

For SDXL-based SegMoEs:

from compel import Compel, ReturnedEmbeddingsType
from segmoe import SegMoEPipeline

t2i = SegMoEPipeline("segmind/SegMoE-4x2-v0", device="cuda")

# SDXL has two tokenizer/text-encoder pairs; pooled output is requested only from the second.
compel = Compel(
    tokenizer=[t2i.pipe.tokenizer, t2i.pipe.tokenizer_2],
    text_encoder=[t2i.pipe.text_encoder, t2i.pipe.text_encoder_2],
    returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
    requires_pooled=[False, True],
)

prompt = "Milky Way. Night sky with stars and silhouette of a standing happy man with yellow light. Space background, (sharp focus:1.2), extremely detailed, (photorealistic:1.4), (RAW image, 8k high resolution:1.2), RAW candid cinema, 16mm, color graded Portra 400 film, ultra realistic, subsurface scattering, ray tracing, (volumetric lighting), extreme contrast, intricate details, reflections on ice, reflections on water, water pouring down"
conditioning, pooled = compel(prompt)

img = t2i(
    prompt_embeds=conditioning, 
    pooled_prompt_embeds=pooled,
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
img.save("image.png")

For SD-based SegMoEs:

from segmoe import SegMoEPipeline
from compel import Compel

t2i = SegMoEPipeline("segmind/SegMoE-SD-4x2-v0", device="cuda")
# SD has a single tokenizer/text-encoder pair, so no pooled embeddings are needed.
compel = Compel(tokenizer=t2i.pipe.tokenizer, text_encoder=t2i.pipe.text_encoder)

prompt = "Milky Way. Night sky with stars and silhouette of a standing happy man with yellow light. Space background, (sharp focus:1.2), extremely detailed, (photorealistic:1.4), (RAW image, 8k high resolution:1.2), RAW candid cinema, 16mm, color graded Portra 400 film, ultra realistic, subsurface scattering, ray tracing, (volumetric lighting), extreme contrast, intricate details, reflections on ice, reflections on water, water pouring down"
prompt_embeds = compel(prompt)

img = t2i(
    prompt_embeds=prompt_embeds,
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
img.save("image.png")

I hope this helps!

imba-pericia · Answer 2 · Wed Feb 21 2024 13:38:44 GMT+0800 (China Standard Time)

Thank you, it runs now. I will test it.

imba-pericia · Answer 3 · Thu Feb 22 2024 16:45:01 GMT+0800 (China Standard Time)

I tested it very quickly. It seemed to me that overall quality decreased while adherence to long prompts improved. I used very long prompts for testing, so perhaps they were "crooked" to begin with.
I also tried adding truncate_long_prompts=False:

self.compel = Compel(
    tokenizer=[self.pipeline.pipe.tokenizer, self.pipeline.pipe.tokenizer_2],
    text_encoder=[self.pipeline.pipe.text_encoder, self.pipeline.pipe.text_encoder_2],
    returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
    requires_pooled=[False, True],
    truncate_long_prompts=False,
)

And max_embeddings_multiples=3 in the pipeline call:

img = self.pipeline(
    prompt_embeds=conditioning,
    pooled_prompt_embeds=pooled,
    height=int(height),
    width=int(width),
    num_inference_steps=int(num_inference_steps),
    guidance_scale=float(guidance_scale),
    max_embeddings_multiples=3,
).images[0]
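
As a rough illustration of what the 77-token limit means for long prompts: CLIP text encoders process windows of 77 tokens (75 prompt tokens plus a start and an end marker), so a longer prompt has to be split into several windows. The sketch below shows only that splitting step in plain Python. It is not compel's actual implementation; the function name split_into_windows, the placeholder token ids, and the choice of padding token are assumptions made for illustration.

```python
CLIP_WINDOW = 77          # tokens per text-encoder pass, including start and end markers
USABLE = CLIP_WINDOW - 2  # 75 prompt tokens fit between the two markers

def split_into_windows(token_ids, bos=49406, eos=49407, pad=49407):
    """Split prompt token ids (without BOS/EOS) into padded 77-token windows.

    The bos/eos defaults are the CLIP start/end token ids; padding with the
    end token is an assumption and differs between tokenizers.
    """
    windows = []
    # max(..., 1) ensures an empty prompt still yields one window.
    for start in range(0, max(len(token_ids), 1), USABLE):
        body = token_ids[start:start + USABLE]
        window = [bos] + body + [eos]
        window += [pad] * (CLIP_WINDOW - len(window))
        windows.append(window)
    return windows

# Hypothetical 180-token prompt; the ids are placeholders, not real vocabulary entries.
fake_ids = list(range(1000, 1180))
windows = split_into_windows(fake_ids)
print(len(windows), [len(w) for w in windows])
```

With 180 prompt tokens this produces three windows of 77 tokens each, which is why very long prompts end up as a multiple of 77 positions rather than being cut off at the first 77.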

imba-pericia · Answer 4 · Thu Feb 22 2024 16:54:38 GMT+0800 (China Standard Time)

(found footage DOF, Aperture, character, hypermaximalist, slutty, beautiful, exotic, revealing, appealing,:1.3), (provocative legwear:1.3), (full body photo:1.3), dynamic scene, action packed, solo, headdress, (voluminous petticoat:1.2 skirt black shiny satin), grand interior, baroque elements, elegant, detailed, 8k resolution, (Royalty:1.3), she is feeling furious, Fighter, subtle cheeks and Pouty lips and Symmetrical shaped face, in Mysterious Chaotic Transparent Unique Neon Lighting, The Chaotic Transparent Unique Neon Lighting is inspired by fantasy, pointe pose, Gray hair styled as Bald, Cluttered Colorful Ruff, Funny Glasses, Sun in the sky, horizon-centered, Vivid (best quality, masterpiece:1.2), photorealistic

Before:

After:

Yatharth Gupta · Answer 5 · Fri Feb 23 2024 00:22:14 GMT+0800 (China Standard Time)

It might be an effect of compel: having more tokens might be having a negative impact on the quality.
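
A possibly related point worth checking: compel's documented weighting syntax puts the weight after the closing parenthesis, as in (text)1.2, or uses + and - suffixes, whereas the prompts in this thread use the (text:1.2) form common in other UIs. Whether that difference affects the results here is untested. The sketch below is a minimal, hypothetical converter for the simple (text:1.2) case only; the helper name to_compel_weights is made up for illustration, and nested parentheses or groups with text after the number are left unchanged.

```python
import re

# Matches "(some text:1.2)" where the text contains no nested parentheses or colons.
_WEIGHTED_GROUP = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")

def to_compel_weights(prompt: str) -> str:
    """Rewrite "(text:1.2)" as "(text)1.2"; everything else is left unchanged."""
    return _WEIGHTED_GROUP.sub(r"(\1)\2", prompt)

example = "(sharp focus:1.2), extremely detailed, (photorealistic:1.4), (volumetric lighting)"
converted = to_compel_weights(example)
print(converted)
```

The converted string can then be passed to compel(prompt) in place of the original prompt to compare results.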