LoRA inference gets a size mismatch
MyLifeisGettingBetter opened this issue · comments
I trained a LoRA on a single RTX 4090 (24GB) with the config below:
```yaml
experiment_id: lora_liang_V1
checkpoint_path: /data/py_project/StableCascade/models
output_path: /data/py_project/StableCascade/models
model_version: 1B

# WandB
wandb_project: StableCascade
wandb_entity: wandb_username

# TRAINING PARAMS
lr: 1.0e-4
batch_size: 4
image_size: 768
multi_aspect_ratio: [1/1, 1/2, 1/3, 2/3, 3/4, 1/5, 2/5, 3/5, 4/5, 1/6, 5/6, 9/16]
grad_accum_steps: 4
updates: 10000
backup_every: 1000
save_every: 100
warmup_updates: 1
# use_fsdp: True -> FSDP doesn't work at the moment for LoRA
use_fsdp: False

# GDF
adaptive_loss_weight: True

# LoRA specific
module_filters: ['.attn']
rank: 4
train_tokens:
  - ['^snail', null] # token starts with "snail" -> "snail" & "snails", don't need to be reinitialized
  - ['[liang]', '^girl'] # custom token [snail], initialize as avg of snail & snails
ema_start_iters: 5000
ema_iters: 100
ema_beta: 0.9

webdataset_path: file:/data/py_project/StableCascade/liang.tar
effnet_checkpoint_path: models/effnet_encoder.safetensors
previewer_checkpoint_path: models/previewer.safetensors
generator_checkpoint_path: models/stage_c_lite_bf16.safetensors
```
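For anyone hitting the same mismatch: the inference config has to target the same model size the LoRA was trained against. A minimal sketch of the relevant overrides — the key names other than `model_version` are taken from the training config above and may not match what `configs/inference/lora_c_3b.yaml` actually uses:

```yaml
# configs/inference/lora_c_3b.yaml (sketch, relevant keys only)
model_version: 1B  # must match the model_version the LoRA was trained with
generator_checkpoint_path: models/stage_c_lite_bf16.safetensors  # the 1B ("lite") Stage C weights
```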
I got the LoRA output directory below:
I've changed model_version to 1B in configs/inference/lora_c_3b.yaml, and I got an error like:
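The size mismatch is what you'd expect when LoRA factors trained against one backbone size are applied to a differently sized one: the low-rank matrices are shaped to the original layer's dimensions. A minimal sketch in plain PyTorch (not StableCascade code; the 1536/2048 widths are hypothetical stand-ins for the 1B vs 3.6B attention dims):

```python
import torch
import torch.nn as nn

rank = 4
small = nn.Linear(1536, 1536)   # hypothetical 1B-model attention projection
large = nn.Linear(2048, 2048)   # hypothetical 3.6B-model attention projection

# LoRA A/B factors created for the small model's layer shape
lora_a = torch.zeros(rank, small.in_features)
lora_b = torch.zeros(small.out_features, rank)

# Merging into the small model works: (1536, 4) @ (4, 1536) -> (1536, 1536)
small.weight.data += lora_b @ lora_a

# Merging into the large model is a shape mismatch: (1536, 1536) vs (2048, 2048)
try:
    large.weight.data += lora_b @ lora_a
except RuntimeError as e:
    print("size mismatch:", e)
```

So the fix is on the config side (make inference load the same-size model), not in the LoRA weights themselves.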
Problem solved!!!