kyegomez / LongNet

Implementation of plug-and-play Attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"

Home Page: https://discord.gg/qUtxnK2NMf

OutOfMemoryError

Qembo154 opened this issue · comments

Non-A100 GPU detected, using math or mem efficient attention if input tensor is on cuda

OutOfMemoryError                          Traceback (most recent call last)
in <cell line: 22>()
     20 # create model and data
     21 model = DilatedAttention(d_model, num_heads, dilation_rate, segment_size).to(device)
---> 22 x = torch.randn((batch_size, seq_len, d_model), device=device, dtype=dtype)
     23
     24

OutOfMemoryError: CUDA out of memory. Tried to allocate 305.18 GiB (GPU 0; 14.75 GiB total capacity; 16.00 KiB already allocated; 14.09 GiB free; 2.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
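The 305.18 GiB figure comes from the `torch.randn((batch_size, seq_len, d_model))` call on line 22, which materializes the whole input tensor at once. The notebook's actual `batch_size`, `seq_len`, and `d_model` values are not shown in the issue, so the numbers below are illustrative, but a quick helper makes it easy to check whether a given shape can possibly fit on a ~15 GiB GPU before allocating it:

```python
import torch

def tensor_gib(batch_size, seq_len, d_model, dtype=torch.float32):
    """Estimate memory (GiB) needed for a (batch_size, seq_len, d_model) tensor."""
    bytes_per_elem = torch.finfo(dtype).bits // 8
    return batch_size * seq_len * d_model * bytes_per_elem / 2**30

# Hypothetical values -- the issue does not show the notebook's settings.
print(f"{tensor_gib(4, 1_000_000, 512):.2f} GiB")  # ~7.63 GiB in float32
```

Running this with the shape that produced the error would reproduce the 305.18 GiB estimate; anything well above the 14.75 GiB capacity reported in the traceback is guaranteed to fail at allocation time.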


@Qembo154 Hey, your GPU is out of memory: the `torch.randn` call alone tries to allocate 305.18 GiB, but the GPU only has 14.75 GiB of total capacity. Reduce `batch_size` and/or `seq_len` (or use a smaller `d_model`, or half precision) and try again.
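A minimal sketch of the fix, keeping the allocation from line 22 of the traceback but with smaller, illustrative sizes (the values below are assumptions, not the notebook's originals) and `float16` to halve activation memory relative to `float32`:

```python
import torch

# Fall back to CPU when no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Illustrative sizes small enough for a ~15 GiB GPU (assumed values).
batch_size, seq_len, d_model = 4, 8192, 512

# float16 uses 2 bytes per element instead of float32's 4.
x = torch.randn((batch_size, seq_len, d_model), device=device, dtype=torch.float16)
print(x.element_size() * x.nelement() / 2**20, "MiB")  # 32.0 MiB
```

If you need the full sequence length, processing the input in segments (which the dilated attention design already operates on) rather than allocating one monolithic tensor is the usual way around this limit.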