lukemelas / realfusion

Official code for "RealFusion: 360° Reconstruction of Any Object from a Single Image" (CVPR 2023)

Textual Inversion code giving error

anandvarrier opened this issue

Description

Hi,
@lukemelas, great work. I've wanted something like this for a while. Your model's accuracy is better than earlier 2D-to-3D models.

I am running all my code on Google Colab (free version). I am following the README; however, I encountered the following error at the Textual Inversion step. I had to edit a few lines to make it run, but to no avail. @lukemelas or anyone else, could you kindly help me set up the code?

I am uploading 2 screenshots for reference.

Thank you
[Screenshot (32)]
[Screenshot (33)]

Steps to Reproduce

.

Expected Behavior

I expected the given code to run as described in the README.

Environment

Google Colab, Python 3.10

Hello!

Perhaps I missed it in your post, but I don't see the error message. Can you provide it?

Luke

Yes, I missed that.
Error message:
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["id2label"] will be overriden.
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["bos_token_id"] will be overriden.
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["eos_token_id"] will be overriden.
2023-07-24 11:20:51.936670: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
File "/content/realfusion/textual-inversion/textual_inversion.py", line 925, in
main()
File "/content/realfusion/textual-inversion/textual_inversion.py", line 574, in main
accelerator = Accelerator(
File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 369, in init
trackers = filter_trackers(log_with, self.logging_dir)
File "/usr/local/lib/python3.10/dist-packages/accelerate/tracking.py", line 725, in filter_trackers
raise ValueError(
ValueError: Logging with tensorboard requires a logging_dir to be passed in.

Hi @lukemelas,
When I run the code below:

from transformers.pipelines.base import Pipeline
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

! export DATA_DIR="/content/realfusion/examples/natural-images/banana_1"
! export OUTPUT_DIR="/content/realfusion/examples/Output_Folder"

!python /content/realfusion/textual-inversion/textual_inversion.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_data_dir= DATA_DIR \
  --learnable_property="object" \
  --placeholder_token="banana" \
  --initializer_token="banana" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --max_train_steps=3000 \
  --learning_rate=5.0e-04 --scale_lr \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --output_dir=OUTPUT_DIR \
  --use_augmentations

I am getting the following error:
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["id2label"] will be overriden.
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["bos_token_id"] will be overriden.
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["eos_token_id"] will be overriden.
2023-07-25 10:34:47.636451: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
usage: textual_inversion.py [-h] [--save_steps SAVE_STEPS] [--only_save_embeds]
  --pretrained_model_name_or_path PRETRAINED_MODEL_NAME_OR_PATH
  [--revision REVISION] [--tokenizer_name TOKENIZER_NAME]
  --train_data_dir TRAIN_DATA_DIR --placeholder_token PLACEHOLDER_TOKEN
  --initializer_token INITIALIZER_TOKEN [--learnable_property LEARNABLE_PROPERTY]
  [--repeats REPEATS] [--output_dir OUTPUT_DIR] [--seed SEED]
  [--resolution RESOLUTION] [--center_crop] [--train_batch_size TRAIN_BATCH_SIZE]
  [--num_train_epochs NUM_TRAIN_EPOCHS] [--max_train_steps MAX_TRAIN_STEPS]
  [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS]
  [--gradient_checkpointing] [--learning_rate LEARNING_RATE] [--scale_lr]
  [--lr_scheduler LR_SCHEDULER] [--lr_warmup_steps LR_WARMUP_STEPS]
  [--dataloader_num_workers DATALOADER_NUM_WORKERS] [--adam_beta1 ADAM_BETA1]
  [--adam_beta2 ADAM_BETA2] [--adam_weight_decay ADAM_WEIGHT_DECAY]
  [--adam_epsilon ADAM_EPSILON] [--push_to_hub] [--hub_token HUB_TOKEN]
  [--hub_model_id HUB_MODEL_ID] [--logging_dir LOGGING_DIR]
  [--mixed_precision {no,fp16,bf16}] [--allow_tf32] [--report_to REPORT_TO]
  [--validation_prompt VALIDATION_PROMPT]
  [--num_validation_images NUM_VALIDATION_IMAGES]
  [--validation_steps VALIDATION_STEPS] [--validation_epochs VALIDATION_EPOCHS]
  [--local_rank LOCAL_RANK] [--checkpointing_steps CHECKPOINTING_STEPS]
  [--checkpoints_total_limit CHECKPOINTS_TOTAL_LIMIT]
  [--resume_from_checkpoint RESUME_FROM_CHECKPOINT]
  [--enable_xformers_memory_efficient_attention] [--use_augmentations]
textual_inversion.py: error: unrecognized arguments: DATA_DIR

Could you help?

Are you using $'s with your environment variables? If you export them with export DATA_DIR=..., you have to reference them as $DATA_DIR.
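
For what it's worth, here is a sketch of a corrected invocation, with the remaining flags as in the command above. Note that in Colab each ! line runs in its own shell, so ! export on one line is not visible to a later !python line; the %env magic persists for the session. The model id below is assumed from the DiffusionPipeline call earlier:

%env DATA_DIR=/content/realfusion/examples/natural-images/banana_1
%env OUTPUT_DIR=/content/realfusion/examples/Output_Folder

!python /content/realfusion/textual-inversion/textual_inversion.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --train_data_dir=$DATA_DIR \
  --output_dir=$OUTPUT_DIR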

Thank you @lukemelas.
As you said, I corrected the above error. But now I am getting a new error.
2023-07-26 10:27:32.459138: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
File "/content/realfusion/textual-inversion/textual_inversion.py", line 925, in
main()
File "/content/realfusion/textual-inversion/textual_inversion.py", line 574, in main
accelerator = Accelerator(
TypeError: Accelerator.init() got an unexpected keyword argument 'logging_dir'

I appreciate your assistance @lukemelas.
Thank you

Hello, this is because accelerate had a breaking change in a recent update. You can either downgrade accelerate or change logging_dir to match the new API (see https://huggingface.co/docs/accelerate/v0.21.0/en/usage_guides/tracking#integrated-trackers).

Hope this helps!
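
To make the two options concrete, here is a sketch; the pinned version below is an assumption, not a tested value. Either downgrade accelerate:

pip install "accelerate==0.18.0"

or pass the logging directory through ProjectConfiguration, matching the new API described in the tracking guide:

from accelerate import Accelerator
from accelerate.utils import ProjectConfiguration

# logging_dir moved out of Accelerator(...) and into ProjectConfiguration
accelerator_project_config = ProjectConfiguration(
    project_dir=args.output_dir,
    logging_dir=os.path.join(args.output_dir, args.logging_dir),
)
accelerator = Accelerator(
    gradient_accumulation_steps=args.gradient_accumulation_steps,
    mixed_precision=args.mixed_precision,
    log_with=args.report_to,
    project_config=accelerator_project_config,
)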

Hello @lukemelas,
I did not exactly understand how to do the above steps.
How do I downgrade accelerate, or change logging_dir to match the new API?

I checked my accelerate version. It is :
Name: accelerate
Version: 0.21.0

  1. How do I downgrade this version?

  2. How do I change logging_dir to match the new API? I tried reading the document you referred to above, but could not understand much.

Thank you for your replies.

Hello @lukemelas,

  1. As you mentioned, I downgraded accelerate to version 0.18.0, which gave a version error: the code asked me to use version <=0.20.3.

  2. I then upgraded to version 0.20.3, and it still gave the same error, i.e. TypeError: Accelerator.__init__() got an unexpected keyword argument 'logging_dir'.
    This problem persists whether I use version 0.21.0 or 0.20.3.

I tried resolving this error by doing the following:
A) I commented out line 572, i.e. logging_dir = os.path.join(args.output_dir, args.logging_dir)
B) I then added an argument on line 574, i.e. accelerator_project_config = ProjectConfiguration(total_limit=args.checkpoints_total_limit, logging_dir=args.logging_dir)
Earlier it was just: accelerator_project_config = ProjectConfiguration(total_limit=args.checkpoints_total_limit)

After running this, the model went into training mode; however, there is now a new error, as follows:
07/27/2023 06:48:52 - INFO - main - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda

Mixed precision type: no

{'timestep_spacing', 'prediction_type', 'thresholding', 'clip_sample_range', 'dynamic_thresholding_ratio', 'sample_max_value', 'variance_type'} was not found in config. Values will be initialized to default values.
{'force_upcast', 'scaling_factor'} was not found in config. Values will be initialized to default values.
{'cross_attention_norm', 'time_embedding_dim', 'use_linear_projection', 'addition_embed_type', 'conv_out_kernel', 'resnet_out_scale_factor', 'encoder_hid_dim', 'dual_cross_attention', 'transformer_layers_per_block', 'resnet_time_scale_shift', 'num_class_embeds', 'time_embedding_act_fn', 'encoder_hid_dim_type', 'time_embedding_type', 'class_embed_type', 'addition_time_embed_dim', 'conv_in_kernel', 'mid_block_only_cross_attention', 'projection_class_embeddings_input_dim', 'time_cond_proj_dim', 'only_cross_attention', 'addition_embed_type_num_heads', 'class_embeddings_concat', 'upcast_attention', 'mid_block_type', 'num_attention_heads', 'timestep_post_act', 'resnet_skip_time_act'} was not found in config. Values will be initialized to default values.
07/27/2023 06:49:11 - INFO - main - ***** Running training *****
07/27/2023 06:49:11 - INFO - main - Num examples = 500
07/27/2023 06:49:11 - INFO - main - Num Epochs = 24
07/27/2023 06:49:11 - INFO - main - Instantaneous batch size per device = 1
07/27/2023 06:49:11 - INFO - main - Total train batch size (w. parallel, distributed & accumulation) = 4
07/27/2023 06:49:11 - INFO - main - Gradient Accumulation steps = 4
07/27/2023 06:49:11 - INFO - main - Total optimization steps = 3000
Steps: 0% 1/3000 [00:11<9:55:00, 11.90s/it, loss=0.00327, lr=0.002]
Traceback (most recent call last):
File "/content/realfusion/textual-inversion/textual_inversion.py", line 925, in <module>
main()
File "/content/realfusion/textual-inversion/textual_inversion.py", line 823, in main
for step, batch in enumerate(train_dataloader):
File "/usr/local/lib/python3.10/dist-packages/accelerate/data_loader.py", line 394, in __iter__
next_batch = next(dataloader_iter)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 633, in __next__
data = self._next_data()
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 677, in _next_data
data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/content/realfusion/textual-inversion/textual_inversion.py", line 514, in __getitem__
image = Image.open(self.image_paths[i % self.num_images])
File "/usr/local/lib/python3.10/dist-packages/PIL/Image.py", line 3227, in open
fp = builtins.open(filename, "rb")
IsADirectoryError: [Errno 21] Is a directory: '/content/realfusion/examples/natural-images/sofa/.ipynb_checkpoints'
Steps: 0% 1/3000 [00:12<10:08:58, 12.18s/it, loss=0.00327, lr=0.002]

How do I go about this error, and was the addition on line 572 the correct way to go ahead?
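
Regarding the IsADirectoryError itself: Jupyter/Colab creates a hidden .ipynb_checkpoints directory inside folders you open in the file browser, and the dataset's __getitem__ treats every entry in the image folder as an image file. A likely fix, though untested here, is simply to delete that folder before training:

!rm -rf /content/realfusion/examples/natural-images/sofa/.ipynb_checkpoints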

  1. I then read the link you provided. As per my understanding, tensorboard is passed via the log_with parameter, and accelerator.init_trackers, accelerator.log, and accelerator.end_training() are all called correctly.

  2. However, the learned_embeds.bin file is not being created, and I do not understand why.

  3. Also, on line 570 of textual_inversion.py: logging_dir = os.path.join(args.output_dir, args.logging_dir)
    Nowhere in the documented GitHub instructions do we pass a logging_dir path the way we pass the output_dir and data_dir paths, so how will line 570 concatenate logging_dir if we never provide it? Is my understanding right? Kindly correct me if I am wrong.

  4. Also, I did not get the part where you mentioned changing logging_dir to match the new API.

  5. How do I get past this step?

Kindly assist.

Thank you

Hi @lukemelas,

I am getting this error for the last 3 days:

  1. If I just run the code from your GitHub repo as-is, Google Colab gives the below error:
    2023-07-31 05:50:26.676773: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
    Traceback (most recent call last):
    File "/content/realfusion/textual-inversion/textual_inversion.py", line 926, in <module>
    main()
    File "/content/realfusion/textual-inversion/textual_inversion.py", line 574, in main
    accelerator = Accelerator(
    TypeError: Accelerator.__init__() got an unexpected keyword argument 'logging_dir'

  2. When I change the code as I have done below:
    def main():
        args = parse_args()
        # logging_dir = os.path.join(args.output_dir, args.logging_dir)

        accelerator_project_config = ProjectConfiguration(total_limit=args.checkpoints_total_limit, logging_dir=os.path.join(args.output_dir, args.logging_dir))

        accelerator = Accelerator(
            gradient_accumulation_steps=args.gradient_accumulation_steps,
            mixed_precision=args.mixed_precision,
            log_with=args.report_to,
            # logging_dir=logging_dir,
            project_config=accelerator_project_config,
        )

It gives another error:
2023-07-31 06:16:03.209058: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
07/31/2023 06:16:06 - INFO - main - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda

Mixed precision type: no

Traceback (most recent call last):
File "/content/realfusion/textual-inversion/textual_inversion.py", line 926, in <module>
main()
File "/content/realfusion/textual-inversion/textual_inversion.py", line 621, in main
os.makedirs(args.output_dir, exist_ok=True)
File "/usr/lib/python3.10/os.py", line 225, in makedirs
mkdir(name, mode)
FileNotFoundError: [Errno 2] No such file or directory: ''
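
A plausible cause of the empty path in this last traceback: if the script is invoked with --output_dir=$OUTPUT_DIR while OUTPUT_DIR is unset (recall that ! export does not persist across Colab cells), the argument expands to an empty string. Setting the variable in the same session should give os.makedirs a real path to create:

%env OUTPUT_DIR=/content/realfusion/examples/Output_Folder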

  1. Since I was not able to solve these 2 issues, I went ahead and ran the 'Side note: Textual Inversion Initialization' code snippet. It generated tokens. Then I ran python main.py --0. It took some time to execute, and towards the end it gave the following message:
    Traceback (most recent call last):
    File "/content/realfusion/main.py", line 163, in <module>
    main()
    File "/content/realfusion/main.py", line 102, in main
    add_tokens_to_model_from_path(
    File "/content/realfusion/sd/utils.py", line 40, in add_tokens_to_model_from_path
    add_tokens_to_model(learned_embeds, text_encoder, tokenizer, override_token)
    File "/content/realfusion/sd/utils.py", line 15, in add_tokens_to_model
    embedding = embedding.to(text_encoder.get_input_embeddings().weight.dtype)
    AttributeError: 'tuple' object has no attribute 'get_input_embeddings'

I am not able to understand how exactly to solve these 2 issues. I tried to understand your code, but got nowhere.

@lukemelas, I would really appreciate your assistance with these questions, or help from anyone who has run the code from this repo.

Thank you

Warm regards,
Anand Varrier

Hi @anandvarrier and @lukemelas. I am getting the same error as you with the accelerate package and the logging directory. Did you manage to get it to work?

@anandvarrier For your point 3, I think you need to upgrade diffusers to at least 0.15.0. If you run pip install diffusers==0.15.0, it should work properly and resolve the error.

Hi @lilyuam,
No, I was not able to get past that error. I will try your solution. Were you able to go ahead and run the code?
Thank you for the response.

@anandvarrier @lilyuam @lukemelas
I got the solution for this: delete logging_dir=logging_dir on line 578 of textual_inversion.py, and add it as a parameter on line 572. The code should look like this:

def main():
    args = parse_args()
    logging_dir = os.path.join(args.output_dir, args.logging_dir)

    accelerator_project_config = ProjectConfiguration(total_limit=args.checkpoints_total_limit, logging_dir=logging_dir)

    accelerator = Accelerator(
        gradient_accumulation_steps=args.gradient_accumulation_steps,
        mixed_precision=args.mixed_precision,
        log_with=args.report_to,
        project_config=accelerator_project_config,
    )
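
One small caveat to the fix above: ProjectConfiguration lives in accelerate.utils, so if the script does not already import it, the top of textual_inversion.py also needs:

from accelerate.utils import ProjectConfiguration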