CUFFT-type error when running huggingface.py to generate embeddings
salvatoreloguercio opened this issue · comments
Hello,
I am using a slightly modified version of the huggingface.py script to generate embeddings from FASTA files. I am using the largest model (1 Mb window size) and running it on an A100 80GB.
I just added a loop at the end of huggingface.py which loads FASTA files and gets embeddings:
for record in records:
    print(record.id)
    sequence = str(record.seq)[0:max_length]
    tok_seq = tokenizer(sequence)
    tok_seq = tok_seq["input_ids"]  # grab ids
    # place on device, convert to tensor
    tok_seq = torch.LongTensor(tok_seq).unsqueeze(0)  # unsqueeze for batch dim
    tok_seq = tok_seq.to(device)
    # prep model and forward
    model.to(device)
    model.eval()
    with torch.inference_mode():
        embeddings = model(tok_seq)
However, after a few hundred iterations I get the following cuFFT error, which seems related to running out of GPU memory:
Traceback (most recent call last):
  File "huggingface_1Mbp.py", line 271, in <module>
    embeddings = model(tok_seq)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/hyena-dna/standalone_hyenadna.py", line 914, in forward
    hidden_states = self.backbone(input_ids, position_ids=position_ids)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/hyena-dna/standalone_hyenadna.py", line 728, in forward
    hidden_states, residual = layer(hidden_states, residual)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/hyena-dna/standalone_hyenadna.py", line 530, in forward
    hidden_states = self.mixer(hidden_states, **mixer_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/hyena-dna/standalone_hyenadna.py", line 288, in forward
    v = self.filter_fn(v, l_filter, k=k[o], bias=bias[o])
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/hyena-dna/standalone_hyenadna.py", line 222, in forward
    y = fftconv(x, k, bias)
  File "/home/hyena-dna/standalone_hyenadna.py", line 53, in fftconv
    k_f = torch.fft.rfft(k, n=fft_size) / fft_size
RuntimeError: cuFFT error: CUFFT_ALLOC_FAILED
So I was wondering: is there a way to flush GPU memory between iterations to prevent this kind of error?
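For what it's worth, one common pattern for this is to hoist the one-time `model.to(device)` / `model.eval()` calls out of the loop, keep only a CPU copy of each result, and explicitly release the per-iteration GPU tensors with `del` plus `torch.cuda.empty_cache()`. A minimal self-contained sketch (the embedding layer, vocab, and `records` list below are dummy stand-ins for the real HyenaDNA model, tokenizer, and Bio.SeqIO records):

```python
import torch
import torch.nn as nn

# Dummy stand-ins so the sketch runs on its own; substitute the real
# HyenaDNA model/tokenizer and FASTA records in practice.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Embedding(5, 8)          # placeholder for the HyenaDNA model
vocab = {c: i for i, c in enumerate("ACGTN")}
records = ["ACGT", "GGTA", "NACG"]  # placeholders for str(record.seq)

# One-time setup, hoisted out of the loop.
model.to(device)
model.eval()

all_embeddings = []
for sequence in records:
    tok_seq = torch.LongTensor([vocab[c] for c in sequence]).unsqueeze(0).to(device)
    with torch.inference_mode():
        embeddings = model(tok_seq)
    # Keep only a CPU copy; then drop the GPU tensors so the caching
    # allocator can reuse their memory for the next sequence.
    all_embeddings.append(embeddings.detach().cpu())
    del tok_seq, embeddings
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```

Note that `empty_cache()` does not free tensors you still hold references to, which is why the `del` (or accumulating results on CPU only) matters; whether this is enough here depends on whether the error is genuine memory pressure or the cuFFT bug mentioned below.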
Thanks!
It might be related to CUDA 11.7: see https://discord.com/channels/1125706816479821874/1125706817016696926/1128087480021823518
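(For context: `CUFFT_INTERNAL_ERROR` from `torch.fft` ops has been reported as a known issue with PyTorch wheels built against CUDA 11.7, and moving to a build against a newer CUDA toolkit reportedly resolves it, though I can't confirm that applies here. A quick way to check which toolkit your wheel was built against:)

```python
import torch

# torch.version.cuda reports the CUDA toolkit the wheel was built with
# (it is None for CPU-only builds); torch.__version__ is the PyTorch release.
print(torch.__version__, torch.version.cuda)
```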
Were you able to solve this? I run into similar issues randomly while doing inference (I cannot access the Discord link, btw):
    return forward_call(*args, **kwargs)
  File "/tmp/.xdg_cache_gbenegas/huggingface/modules/transformers_modules/LongSafari/hyenadna-large-1m-seqlen-hf/8eb99a87c0bbaf0fec9346d72c60360c3a5b9e33/modeling_hyena.py", line 158, in forward
    y = fftconv(x, k, bias)
  File "/tmp/.xdg_cache_gbenegas/huggingface/modules/transformers_modules/LongSafari/hyenadna-large-1m-seqlen-hf/8eb99a87c0bbaf0fec9346d72c60360c3a5b9e33/modeling_hyena.py", line 26, in fftconv
    y = torch.fft.irfft(u_f * k_f, n=fft_size, norm='forward')[..., :seqlen]
RuntimeError: cuFFT error: CUFFT_INTERNAL_ERROR
No, I ended up using the model for fewer iterations at a time and then reloading the image. Just a workaround.