frank-xwang / InstanceDiffusion

[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"

Home Page: https://people.eecs.berkeley.edu/~xdwang/projects/InstDiff/


Other ways to train the model

66ling66 opened this issue · comments

[Screenshot 2024-03-18 171430]
Could you provide new instructions, or some examples of the training data?

Can you provide the full command you used for running the experiments and the error message? Thanks.

Thank you. My GPU memory is not enough, so I just changed my training strategy.

grounding_input = self.grounding_tokenizer_input.prepare(batch, return_att_masks=self.config.use_masked_att)

If I don't want to add this when training the UNet model, what should I do? Is there a simple way to do it?

Sorry for the late reply. I am a little confused: do you mean you want to train the model without any grounding inputs? Our model training needs instance/part-level location and caption inputs; otherwise, it is equivalent to directly fine-tuning the Stable Diffusion model.

Yes, I want to train the model without any grounding inputs. I just train a LoRA model, and when sampling images I add the grounding inputs; it seems to work a little.

Oh, I see. Maybe you can provide zero tensors as placeholders for the bboxes, masks, and instance caption embeddings. The easiest way might be to call `self.grounding_tokenizer_input.get_null_input()`. You will need to manually set `self.set = True` and provide `self.device`, `self.dtype`, `self.max_box`, etc.

You can find more details on this function in `grounding_input/text_grounding_tokinzer_input.py`.
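
For reference, a minimal sketch of the zero-tensor placeholders described above. The dict keys, tensor shapes, and the `max_box`/`in_dim` defaults are assumptions based on the GLIGEN-style text grounding tokenizer; your local copy of the file may expect different keys or dimensions, so please check it first:

```python
import torch

def make_null_grounding_input(batch_size, max_box=30, in_dim=768,
                              device="cpu", dtype=torch.float32):
    """Zero-tensor placeholders for bboxes, masks, and instance caption embeddings.

    Assumed shapes (GLIGEN-style tokenizer; verify against your local file):
      boxes                (B, max_box, 4)       -- one box per instance slot
      masks                (B, max_box)          -- 0 means the slot is inactive
      positive_embeddings  (B, max_box, in_dim)  -- instance caption embeddings
    """
    boxes = torch.zeros(batch_size, max_box, 4, device=device, dtype=dtype)
    masks = torch.zeros(batch_size, max_box, device=device, dtype=dtype)
    positive_embeddings = torch.zeros(batch_size, max_box, in_dim,
                                      device=device, dtype=dtype)
    return {"boxes": boxes, "masks": masks,
            "positive_embeddings": positive_embeddings}

# Hypothetical usage inside the training step, replacing the prepare() call
# (the batch["image"] key is an assumption about your dataloader):
# grounding_input = make_null_grounding_input(batch["image"].shape[0],
#                                             device=batch["image"].device)
```

Because all mask entries are zero, every instance slot should be treated as inactive, which matches what `get_null_input()` produces after `prepare()` has set the bookkeeping fields.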