[BUG] Using LoRA with prompt is much slower than with nodes
PurplefinNeptuna opened this issue
Description
When using LoRA in the prompt, it is always slower than using LoRA via Load LoRA nodes, although in both cases the time per step is similar. Maybe because of loading/patching?
My Setup
CPU: AMD Ryzen R7 5700X
GPU: AMD RX 6700XT
ComfyUI args: --normalvram --use-pytorch-cross-attention --disable-xformers
ComfyUI is running on Linux, so it uses ROCm, not DirectML.
To Reproduce
With this workflow, which loads 2 LoRAs via the prompt:
The 1st queue takes 21s; afterwards it's around 15s.
Expected Behavior
With this workflow, which loads 2 LoRAs using Load LoRA nodes:
The 1st queue takes 13s; afterwards it's around 8s.
I think this happens because my node always unpatches the model after sampling to avoid messing up the model weights, so it needs to repatch constantly.
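Roughly, the pattern is something like this (a simplified sketch for illustration, not the actual prompt-control code; `run_sampler` is just a placeholder for whatever sampler gets invoked):

```python
import comfy.utils
import comfy.sd

def sample_once(model, clip, lora_path, strength, run_sampler):
    lora = comfy.utils.load_torch_file(lora_path, safe_load=True)
    # Register the LoRA weight patches on clones of the model and CLIP;
    # applying the weight deltas is the expensive part.
    patched_model, patched_clip = comfy.sd.load_lora_for_models(
        model, clip, lora, strength, strength
    )
    out = run_sampler(patched_model, patched_clip)
    # Discarding the patched clones after sampling keeps the base weights
    # clean, but it also throws away the patching work, so the next queue
    # has to redo all of it even if the prompt didn't change.
    del patched_model, patched_clip
    return out
```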
Try the https://github.com/asagi4/comfyui-prompt-control/tree/load_optimization branch; I haven't merged it into master yet because I'm not sure that I've dealt with all the bugs introduced by the weight shuffling, but it avoids the constant repatching
You'll also need to set the COMFYUI_PC_CACHE_MODEL environment variable to a non-empty value to enable the model caching.
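Any non-empty string (e.g. "1") works. Purely for illustration (the check below is hypothetical; only the variable name comes from above), this is the usual way such a flag gets read:

```python
import os

# Any non-empty value enables the model cache; unset or empty disables it.
CACHE_MODELS = bool(os.environ.get("COMFYUI_PC_CACHE_MODEL", ""))
```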
Also note that I may rebase and force-push to that branch if I make changes to master, so if you use it you'll need to do the occasional hard reset to your own clone.
Traceback (most recent call last):
  File "/home/purplefin/Programs/StabilityMatrix/Data/Packages/ComfyUI/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/home/purplefin/Programs/StabilityMatrix/Data/Packages/ComfyUI/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/home/purplefin/Programs/StabilityMatrix/Data/Packages/ComfyUI/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/home/purplefin/Programs/StabilityMatrix/Data/Packages/ComfyUI/custom_nodes/comfyui-prompt-control/prompt_control/node_aio.py", line 32, in apply
    pos_cond = pos_filtered = control_to_clip_common(clip, pos_sched, self.lora_cache, cond_cache)
AttributeError: 'PromptControlSimple' object has no attribute 'lora_cache'
I got this error when using that branch.
It works when I use the normal nodes (prompt to sched, sched to cond, etc.),
but now I found an issue: ScheduleToModel doesn't work with the Efficiency Nodes' KSampler. When I change only the LoRA, that sampler still gives me cached results.
@PurplefinNeptuna can you give me an example workflow of it failing? If possible, try to minimize the workflow so that there are no extra elements.
@asagi4 here are the workflows:
PCSFail.json <- PromptControlSimple not working
stmpatchfail.json <- ScheduleToModel fails to patch/unpatch LoRA for KSampler (Eff.)
I'm using the load_optimization branch.
Hmm, there's something weird going on. Looking at the terminal output, if I modify the LoRA in the prompt, the CLIP patching indicates that it's handled correctly, but for whatever reason the nodes afterwards don't detect that the cond has changed and seem to just output a cached value.
I added a quick fix for the AIO node. Not sure what's going on with the Efficient KSampler.
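For context on the caching: ComfyUI only re-executes a node when its inputs or its IS_CHANGED value differ from the previous queue, and recomputed outputs invalidate the nodes downstream. A node can force re-execution like this (illustrative skeleton only; whether this helps the KSampler (Eff.) case depends on how that node does its own caching):

```python
class ExampleScheduledNode:
    # Hypothetical node skeleton; only IS_CHANGED matters here.
    @classmethod
    def IS_CHANGED(cls, **kwargs):
        # NaN never compares equal to itself, so ComfyUI treats the node as
        # changed and re-executes it on every queue.
        return float("NaN")
```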
@asagi4 I found a new issue when using the load_optimization branch; should I keep reporting it here or open a new issue?
@PurplefinNeptuna It's fine to report here if it's specific to the load_optimization branch.
@asagi4 the issue is that when I put 2 or more LoRAs in the prompt and change the weight of one of them, it somehow unpatches all the LoRAs but only repatches the one with the new weight.
example:
first queue:
<lora:a:1.0><lora:b:1.0>
second queue:
<lora:a:0.8><lora:b:1.0>
What happens is that in the 2nd run, LoRA b gets unpatched and is never repatched until I change all the LoRAs' weights (because if I only change LoRA b's weight, LoRA a gets unpatched instead).
Sadly I can't give you a workflow or image, because my drive just broke and I lost all my data, including ComfyUI.
@PurplefinNeptuna I think I hit this bug myself too just yesterday. There's probably some bug in how the state of the model is tracked.
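The bookkeeping roughly has to diff the previously applied LoRAs against the newly requested ones; a tiny illustrative sketch (hypothetical names, not the branch's actual code):

```python
def diff_loras(applied: dict[str, float], wanted: dict[str, float]):
    """Return (to_unpatch, to_patch) given previously applied and newly
    requested {lora_name: weight} mappings."""
    to_unpatch = [n for n, w in applied.items() if wanted.get(n) != w]
    to_patch = [n for n, w in wanted.items() if applied.get(n) != w]
    return to_unpatch, to_patch

# Example from the report: only lora "a" changes weight, so only "a" should
# be unpatched and repatched; "b" must stay applied.
print(diff_loras({"a": 1.0, "b": 1.0}, {"a": 0.8, "b": 1.0}))
# -> (['a'], ['a'])
```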
@PurplefinNeptuna I think I fixed it. Try the latest commit on the branch.