encoder_attention_mask error
kotaxyz opened this issue · comments
hello there ,thanks alot for this awesome tool.
i have a problem running it , i get this error
note that i use cpu method ,cause i get oom error when i use gpu cause i have only 8gb vram
PS C:\Users\Genesis\anaconda3\tango> & C:/Users/Genesis/anaconda3/envs/tango/python.exe c:/Users/Genesis/anaconda3/tango/generate.py
Fetching 8 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:00<?, ?it/s]
c:\Users\Genesis\anaconda3\tango\audioldm\audio\stft.py:42: FutureWarning: Pass size=1024 as keyword args. From version 0.10 passing these as positional arguments will result in an error
fft_window = pad_center(fft_window, filter_length)
c:\Users\Genesis\anaconda3\tango\audioldm\audio\stft.py:151: FutureWarning: Pass sr=16000, n_fft=1024, n_mels=64, fmin=0, fmax=8000 as keyword args. From version 0.10 passing these as positional arguments will result in an error
mel_basis = librosa_mel_fn(
UNet initialized randomly.
Some weights of the model checkpoint at google/flan-t5-large were not used when initializing T5EncoderModel: ['decoder.block.16.layer.0.SelfAttention.v.weight', 'decoder.block.14.layer.0.layer_norm.weight', 'decoder.block.7.layer.0.layer_norm.weight', 'decoder.block.10.layer.1.layer_norm.weight', 'decoder.block.13.layer.0.SelfAttention.v.weight', 'decoder.block.20.layer.2.DenseReluDense.wo.weight', 'decoder.block.6.layer.1.EncDecAttention.v.weight', 'decoder.block.16.layer.2.DenseReluDense.wo.weight', 'decoder.block.0.layer.1.EncDecAttention.k.weight', 'decoder.block.2.layer.0.layer_norm.weight', 'decoder.block.8.layer.2.layer_norm.weight', 'decoder.block.16.layer.1.EncDecAttention.v.weight', 'decoder.block.13.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.3.layer.1.EncDecAttention.k.weight', 'decoder.block.4.layer.0.SelfAttention.q.weight', 'decoder.block.1.layer.0.layer_norm.weight', 'decoder.block.20.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.10.layer.1.EncDecAttention.v.weight', 'decoder.block.8.layer.1.layer_norm.weight', 'decoder.block.11.layer.0.SelfAttention.k.weight', 'decoder.block.22.layer.1.layer_norm.weight', 'decoder.block.1.layer.2.DenseReluDense.wo.weight', 'decoder.block.13.layer.0.SelfAttention.q.weight', 'decoder.block.16.layer.1.EncDecAttention.o.weight', 'decoder.block.2.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.3.layer.0.SelfAttention.v.weight', 'decoder.block.16.layer.0.layer_norm.weight', 'decoder.block.11.layer.1.EncDecAttention.k.weight', 'decoder.block.18.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.17.layer.1.EncDecAttention.o.weight', 'decoder.block.7.layer.1.EncDecAttention.v.weight', 'decoder.block.11.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.13.layer.0.SelfAttention.k.weight', 'decoder.block.3.layer.0.SelfAttention.o.weight', 'decoder.block.13.layer.2.layer_norm.weight', 'decoder.block.12.layer.0.SelfAttention.k.weight', 'decoder.block.18.layer.2.DenseReluDense.wo.weight', 'decoder.block.15.layer.2.DenseReluDense.wo.weight', 'decoder.block.20.layer.0.SelfAttention.v.weight', 'decoder.block.4.layer.1.EncDecAttention.v.weight', 'decoder.block.8.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.14.layer.1.EncDecAttention.o.weight', 'decoder.block.7.layer.0.SelfAttention.o.weight', 'decoder.block.20.layer.0.SelfAttention.q.weight', 'decoder.block.21.layer.0.SelfAttention.k.weight', 'decoder.block.8.layer.0.SelfAttention.v.weight', 'decoder.block.20.layer.2.layer_norm.weight', 'decoder.block.23.layer.1.EncDecAttention.q.weight', 'decoder.block.17.layer.1.EncDecAttention.q.weight', 'decoder.block.4.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.10.layer.0.SelfAttention.k.weight', 'decoder.block.22.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.0.layer.1.EncDecAttention.v.weight', 'decoder.block.5.layer.1.EncDecAttention.k.weight', 'decoder.block.5.layer.0.SelfAttention.k.weight', 'decoder.block.8.layer.1.EncDecAttention.k.weight', 'decoder.block.8.layer.0.SelfAttention.o.weight', 'decoder.block.9.layer.1.layer_norm.weight', 'decoder.block.14.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.20.layer.1.EncDecAttention.k.weight', 'decoder.block.0.layer.1.EncDecAttention.o.weight', 'decoder.block.4.layer.0.SelfAttention.k.weight', 'decoder.block.22.layer.1.EncDecAttention.o.weight', 'decoder.block.15.layer.1.EncDecAttention.v.weight', 'decoder.block.6.layer.2.layer_norm.weight', 'decoder.block.12.layer.0.SelfAttention.o.weight', 'decoder.block.13.layer.1.EncDecAttention.q.weight', 'decoder.block.21.layer.1.layer_norm.weight', 'decoder.block.4.layer.1.EncDecAttention.o.weight', 'decoder.block.2.layer.1.EncDecAttention.o.weight', 'decoder.block.18.layer.0.SelfAttention.k.weight', 'decoder.block.16.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.11.layer.2.DenseReluDense.wo.weight', 'decoder.block.13.layer.1.EncDecAttention.o.weight', 'decoder.block.20.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.7.layer.0.SelfAttention.k.weight', 'decoder.block.13.layer.1.EncDecAttention.v.weight', 'decoder.block.23.layer.0.layer_norm.weight', 'decoder.block.17.layer.0.SelfAttention.o.weight', 'decoder.block.15.layer.0.SelfAttention.q.weight', 'decoder.block.17.layer.0.layer_norm.weight', 'decoder.block.11.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.1.layer.0.SelfAttention.q.weight', 'decoder.block.16.layer.0.SelfAttention.k.weight', 'decoder.block.7.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.8.layer.0.SelfAttention.k.weight', 'decoder.block.20.layer.0.SelfAttention.k.weight', 'decoder.block.22.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.13.layer.0.SelfAttention.o.weight', 'decoder.block.11.layer.2.layer_norm.weight', 'decoder.block.1.layer.1.layer_norm.weight', 'decoder.block.20.layer.1.EncDecAttention.q.weight', 'decoder.block.22.layer.1.EncDecAttention.k.weight', 'decoder.block.6.layer.0.SelfAttention.k.weight', 'decoder.block.18.layer.1.EncDecAttention.v.weight', 'decoder.block.9.layer.0.layer_norm.weight', 'decoder.block.23.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.2.layer.0.SelfAttention.k.weight', 'decoder.block.4.layer.2.layer_norm.weight',
'decoder.block.1.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.13.layer.0.layer_norm.weight', 'decoder.block.2.layer.1.layer_norm.weight', 'decoder.block.6.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.10.layer.1.EncDecAttention.o.weight', 'decoder.block.21.layer.2.DenseReluDense.wo.weight', 'decoder.block.3.layer.1.EncDecAttention.q.weight', 'decoder.block.12.layer.1.EncDecAttention.q.weight', 'decoder.block.0.layer.2.layer_norm.weight', 'decoder.block.19.layer.0.SelfAttention.k.weight', 'decoder.block.8.layer.0.SelfAttention.q.weight', 'decoder.block.18.layer.0.SelfAttention.v.weight', 'decoder.block.6.layer.1.EncDecAttention.o.weight', 'decoder.block.5.layer.2.DenseReluDense.wo.weight', 'decoder.block.8.layer.0.layer_norm.weight', 'decoder.block.21.layer.0.SelfAttention.o.weight', 'decoder.block.1.layer.2.layer_norm.weight', 'decoder.block.22.layer.2.layer_norm.weight', 'decoder.block.5.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.9.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.15.layer.1.EncDecAttention.o.weight', 'decoder.block.23.layer.2.layer_norm.weight', 'decoder.block.19.layer.1.EncDecAttention.q.weight', 'decoder.block.15.layer.1.EncDecAttention.k.weight', 'decoder.block.19.layer.0.layer_norm.weight', 'decoder.block.17.layer.2.layer_norm.weight', 'decoder.block.18.layer.1.layer_norm.weight', 'decoder.block.21.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.20.layer.1.layer_norm.weight', 'decoder.block.3.layer.1.layer_norm.weight', 'decoder.block.5.layer.0.SelfAttention.o.weight', 'decoder.block.3.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.16.layer.1.layer_norm.weight', 'decoder.block.11.layer.1.layer_norm.weight', 'decoder.block.3.layer.0.SelfAttention.k.weight', 'decoder.block.9.layer.0.SelfAttention.k.weight', 'decoder.block.20.layer.1.EncDecAttention.o.weight', 'decoder.block.19.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.4.layer.1.layer_norm.weight', 'decoder.block.11.layer.0.SelfAttention.v.weight', 'decoder.block.18.layer.0.layer_norm.weight', 'decoder.block.23.layer.0.SelfAttention.q.weight', 'decoder.block.0.layer.1.EncDecAttention.q.weight', 'decoder.block.16.layer.0.SelfAttention.o.weight', 'decoder.block.14.layer.0.SelfAttention.o.weight', 'decoder.block.21.layer.1.EncDecAttention.o.weight', 'decoder.block.6.layer.0.SelfAttention.v.weight', 'decoder.block.19.layer.2.layer_norm.weight', 'decoder.block.8.layer.1.EncDecAttention.q.weight', 'decoder.block.10.layer.1.EncDecAttention.k.weight', 'decoder.block.4.layer.2.DenseReluDense.wo.weight', 'decoder.block.17.layer.1.EncDecAttention.v.weight', 'decoder.block.7.layer.1.EncDecAttention.q.weight', 'decoder.block.13.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.18.layer.1.EncDecAttention.o.weight', 'decoder.block.2.layer.2.layer_norm.weight', 'decoder.block.0.layer.0.layer_norm.weight', 'decoder.block.7.layer.1.EncDecAttention.o.weight', 'decoder.block.8.layer.2.DenseReluDense.wo.weight', 'decoder.embed_tokens.weight', 'decoder.block.12.layer.1.EncDecAttention.v.weight', 'decoder.block.7.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.5.layer.1.EncDecAttention.v.weight', 'decoder.block.2.layer.0.SelfAttention.o.weight', 'decoder.block.22.layer.0.layer_norm.weight', 'decoder.block.6.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.23.layer.2.DenseReluDense.wo.weight', 'decoder.block.5.layer.1.EncDecAttention.o.weight', 'decoder.block.11.layer.1.EncDecAttention.o.weight', 'decoder.block.3.layer.1.EncDecAttention.o.weight', 'decoder.block.14.layer.0.SelfAttention.k.weight', 'decoder.block.14.layer.2.layer_norm.weight', 'decoder.block.7.layer.0.SelfAttention.v.weight', 'decoder.block.4.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.16.layer.0.SelfAttention.q.weight', 'decoder.block.9.layer.2.DenseReluDense.wo.weight', 'decoder.block.7.layer.2.DenseReluDense.wo.weight', 'decoder.block.22.layer.0.SelfAttention.q.weight', 'decoder.block.4.layer.0.SelfAttention.o.weight', 'decoder.block.19.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.1.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.4.layer.1.EncDecAttention.k.weight', 'decoder.block.20.layer.0.layer_norm.weight', 'decoder.block.10.layer.0.layer_norm.weight', 'decoder.block.6.layer.0.SelfAttention.q.weight', 'decoder.block.12.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.3.layer.0.SelfAttention.q.weight', 'decoder.block.10.layer.0.SelfAttention.q.weight', 'decoder.block.9.layer.0.SelfAttention.o.weight', 'decoder.block.0.layer.0.SelfAttention.k.weight', 'decoder.block.6.layer.0.layer_norm.weight', 'decoder.block.13.layer.2.DenseReluDense.wo.weight', 'decoder.block.15.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.17.layer.1.EncDecAttention.k.weight', 'decoder.block.8.layer.1.EncDecAttention.v.weight', 'decoder.block.17.layer.2.DenseReluDense.wo.weight', 'decoder.block.18.layer.2.layer_norm.weight', 'decoder.block.6.layer.2.DenseReluDense.wo.weight', 'decoder.block.10.layer.2.DenseReluDense.wo.weight', 'decoder.block.0.layer.0.SelfAttention.q.weight', 'decoder.block.1.layer.1.EncDecAttention.q.weight', 'decoder.block.7.layer.2.layer_norm.weight', 'decoder.block.7.layer.0.SelfAttention.q.weight', 'decoder.block.4.layer.0.layer_norm.weight', 'decoder.block.10.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.14.layer.1.EncDecAttention.v.weight', 'decoder.block.18.layer.1.EncDecAttention.k.weight', 'decoder.block.10.layer.0.SelfAttention.v.weight', 'decoder.block.20.layer.0.SelfAttention.o.weight', 'decoder.block.15.layer.0.SelfAttention.o.weight', 'decoder.block.12.layer.0.SelfAttention.q.weight', 'decoder.block.1.layer.0.SelfAttention.o.weight', 'decoder.block.2.layer.0.SelfAttention.v.weight', 'decoder.block.4.layer.1.EncDecAttention.q.weight', 'decoder.block.0.layer.2.DenseReluDense.wo.weight', 'decoder.block.19.layer.1.EncDecAttention.o.weight', 'decoder.block.23.layer.1.EncDecAttention.k.weight', 'decoder.block.5.layer.0.SelfAttention.q.weight', 'decoder.block.12.layer.0.layer_norm.weight', 'decoder.block.14.layer.1.EncDecAttention.k.weight', 'decoder.block.17.layer.0.SelfAttention.q.weight', 'decoder.block.21.layer.2.layer_norm.weight', 'decoder.block.10.layer.1.EncDecAttention.q.weight', 'decoder.block.19.layer.0.SelfAttention.q.weight', 'decoder.block.5.layer.0.SelfAttention.v.weight', 'decoder.block.19.layer.1.EncDecAttention.k.weight', 'decoder.block.1.layer.0.SelfAttention.v.weight', 'decoder.block.12.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.18.layer.1.EncDecAttention.q.weight', 'decoder.block.12.layer.2.DenseReluDense.wo.weight', 'decoder.block.15.layer.0.SelfAttention.k.weight', 'decoder.block.3.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.7.layer.1.EncDecAttention.k.weight', 'decoder.block.12.layer.1.EncDecAttention.o.weight', 'decoder.block.16.layer.1.EncDecAttention.k.weight', 'decoder.block.14.layer.0.SelfAttention.v.weight', 'decoder.block.23.layer.1.layer_norm.weight', 'decoder.block.1.layer.1.EncDecAttention.k.weight', 'decoder.block.6.layer.1.EncDecAttention.k.weight', 'decoder.block.16.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.6.layer.0.SelfAttention.o.weight', 'decoder.block.14.layer.2.DenseReluDense.wo.weight', 'decoder.block.12.layer.1.EncDecAttention.k.weight', 'decoder.block.15.layer.0.layer_norm.weight', 'decoder.block.22.layer.1.EncDecAttention.v.weight', 'decoder.block.23.layer.0.SelfAttention.k.weight', 'decoder.block.12.layer.2.layer_norm.weight', 'decoder.block.7.layer.1.layer_norm.weight', 'decoder.block.0.layer.1.layer_norm.weight', 'decoder.block.5.layer.2.layer_norm.weight', 'decoder.block.22.layer.0.SelfAttention.v.weight', 'decoder.block.11.layer.0.layer_norm.weight', 'decoder.block.1.layer.0.SelfAttention.k.weight', 'decoder.block.13.layer.1.layer_norm.weight', 'decoder.block.23.layer.0.SelfAttention.o.weight', 'decoder.block.17.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.15.layer.1.EncDecAttention.q.weight', 'decoder.block.21.layer.0.layer_norm.weight', 'decoder.block.12.layer.1.layer_norm.weight', 'decoder.block.19.layer.2.DenseReluDense.wo.weight', 'decoder.block.22.layer.0.SelfAttention.o.weight', 'lm_head.weight', 'decoder.block.0.layer.0.SelfAttention.relative_attention_bias.weight', 'decoder.block.11.layer.1.EncDecAttention.q.weight', 'decoder.block.16.layer.2.layer_norm.weight', 'decoder.block.14.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.19.layer.0.SelfAttention.o.weight', 'decoder.block.18.layer.2.DenseReluDense.wi_1.weight', 'decoder.final_layer_norm.weight', 'decoder.block.10.layer.0.SelfAttention.o.weight', 'decoder.block.17.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.15.layer.2.layer_norm.weight', 'decoder.block.19.layer.1.layer_norm.weight', 'decoder.block.20.layer.1.EncDecAttention.v.weight', 'decoder.block.10.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.5.layer.1.EncDecAttention.q.weight', 'decoder.block.14.layer.1.EncDecAttention.q.weight', 'decoder.block.23.layer.1.EncDecAttention.v.weight', 'decoder.block.22.layer.2.DenseReluDense.wo.weight', 'decoder.block.12.layer.0.SelfAttention.v.weight', 'decoder.block.1.layer.1.EncDecAttention.o.weight', 'decoder.block.21.layer.0.SelfAttention.q.weight', 'decoder.block.11.layer.0.SelfAttention.o.weight', 'decoder.block.18.layer.0.SelfAttention.o.weight', 'decoder.block.5.layer.0.layer_norm.weight', 'decoder.block.21.layer.0.SelfAttention.v.weight', 'decoder.block.9.layer.1.EncDecAttention.v.weight', 'decoder.block.2.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.9.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.18.layer.0.SelfAttention.q.weight', 'decoder.block.8.layer.1.EncDecAttention.o.weight', 'decoder.block.17.layer.0.SelfAttention.v.weight', 'decoder.block.3.layer.2.DenseReluDense.wo.weight', 'decoder.block.15.layer.1.layer_norm.weight', 'decoder.block.0.layer.0.SelfAttention.v.weight', 'decoder.block.21.layer.1.EncDecAttention.q.weight', 'decoder.block.3.layer.2.layer_norm.weight', 'decoder.block.16.layer.1.EncDecAttention.q.weight', 'decoder.block.9.layer.1.EncDecAttention.q.weight', 'decoder.block.15.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.14.layer.1.layer_norm.weight', 'decoder.block.1.layer.1.EncDecAttention.v.weight', 'decoder.block.9.layer.2.layer_norm.weight', 'decoder.block.2.layer.0.SelfAttention.q.weight', 'decoder.block.19.layer.0.SelfAttention.v.weight', 'decoder.block.19.layer.1.EncDecAttention.v.weight', 'decoder.block.3.layer.0.layer_norm.weight', 'decoder.block.2.layer.2.DenseReluDense.wo.weight', 'decoder.block.4.layer.0.SelfAttention.v.weight', 'decoder.block.0.layer.0.SelfAttention.o.weight', 'decoder.block.23.layer.0.SelfAttention.v.weight', 'decoder.block.14.layer.0.SelfAttention.q.weight',
'decoder.block.22.layer.1.EncDecAttention.q.weight', 'decoder.block.15.layer.0.SelfAttention.v.weight', 'decoder.block.3.layer.1.EncDecAttention.v.weight', 'decoder.block.0.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.8.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.21.layer.1.EncDecAttention.v.weight', 'decoder.block.10.layer.2.layer_norm.weight', 'decoder.block.9.layer.1.EncDecAttention.o.weight', 'decoder.block.2.layer.1.EncDecAttention.k.weight', 'decoder.block.2.layer.1.EncDecAttention.v.weight', 'decoder.block.2.layer.1.EncDecAttention.q.weight', 'decoder.block.5.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.17.layer.1.layer_norm.weight', 'decoder.block.9.layer.1.EncDecAttention.k.weight', 'decoder.block.21.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.23.layer.1.EncDecAttention.o.weight', 'decoder.block.0.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.17.layer.0.SelfAttention.k.weight', 'decoder.block.21.layer.1.EncDecAttention.k.weight', 'decoder.block.9.layer.0.SelfAttention.v.weight', 'decoder.block.11.layer.1.EncDecAttention.v.weight', 'decoder.block.5.layer.1.layer_norm.weight', 'decoder.block.9.layer.0.SelfAttention.q.weight', 'decoder.block.23.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.6.layer.1.layer_norm.weight', 'decoder.block.11.layer.0.SelfAttention.q.weight', 'decoder.block.13.layer.1.EncDecAttention.k.weight', 'decoder.block.22.layer.0.SelfAttention.k.weight', 'decoder.block.6.layer.1.EncDecAttention.q.weight']
- This IS expected if you are initializing T5EncoderModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing T5EncoderModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Successfully loaded checkpoint from: declare-lab/tango
c:\Users\Genesis\anaconda3\tango\models.py:223: FutureWarning: Accessing config attributein_channels
directly via 'UNet2DConditionModel' object attribute is deprecated. Please access 'in_channels' over 'UNet2DConditionModel's config object instead, e.g. 'unet.config.in_channels'.
num_channels_latents = self.unet.in_channels
Traceback (most recent call last):
File "c:\Users\Genesis\anaconda3\tango\generate.py", line 8, in
audio = tango.generate(prompt, steps=50)
File "c:\Users\Genesis\anaconda3\tango\tango.py", line 46, in generate
latents = self.model.inference([prompt], self.scheduler, steps, guidance, samples, disable_progress=disable_progress)
File "C:\Users\Genesis\anaconda3\envs\tango\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "c:\Users\Genesis\anaconda3\tango\models.py", line 234, in inference
noise_pred = self.unet(
File "C:\Users\Genesis\anaconda3\envs\tango\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
TypeError: UNet2DConditionModel.forward() got an unexpected keyword argument 'encoder_attention_mask'
Check the installation instructions in the README. You need to install diffusers from the directory provided with this repository:
cd tango/diffusers
pip install -e .
That will ensure the encoder_attention_mask
functionality in UNet2D
it worked thanks alot