Notebook Crashes when loading model to CUDA
souvik0306 opened this issue · comments
Is this a new bug?
- I believe this is a new bug
- I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
Hi, I was running your notebook, but my kernel crashes every time I try to load the model. I am using an NVIDIA GeForce RTX 4090 GPU. It crashes at this point:
from torch import cuda, bfloat16
import transformers
device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'
model = transformers.AutoModelForCausalLM.from_pretrained(
    'mosaicml/mpt-7b-instruct',
    trust_remote_code=True,
    torch_dtype=bfloat16,
    max_seq_len=2048
)
model.eval()
model.to(device)
print(f"Model loaded on {device}")
I am using the latest version of the transformers library and the model checkpoint. I have also ensured that all the dependencies are up to date.
Any assistance would be greatly appreciated. Thank you!
Expected Behavior
The model should load onto the CUDA device.
Steps To Reproduce
I ran it the same way as in your code.
If possible, can you confirm whether you were using Colab Pro?
Relevant log output
No response
Environment
- **OS**: Ubuntu
- **Language version**: Python3
My system specs:
- NVIDIA GeForce RTX 4090
- PyTorch 2.0.1
- CUDA 11.7
Additional Context
No response
The same is happening to me.
Hello, it appears that your instance lacks sufficient RAM capacity to load the entire model into memory. To resolve this, you can execute our notebooks using Google Colab with a modified Runtime Environment. I recommend setting the Hardware accelerator to GPU, selecting T4 as the GPU type, and configuring the Runtime Shape to "High RAM."
If the issue persists, consider running the notebook on a larger GPU instance.
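To check whether host RAM is the likely bottleneck before loading, a rough estimate can help. The sketch below is not part of the notebook. It assumes about 6.7 billion parameters for MPT-7B, and a peak host-memory factor of 2x during loading (weights plus a temporary copy of the checkpoint). Both numbers are assumptions, and the helper names are hypothetical. It reads total physical RAM through `os.sysconf`, which works on Linux such as the reporter's Ubuntu machine.

```python
import os

# Bytes used per parameter for common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def estimate_weights_gib(num_params, dtype="bfloat16"):
    """Approximate size of the raw weights in GiB (no activations, no overhead)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


def total_host_ram_gib():
    """Total physical RAM in GiB, or None if the platform does not expose it."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
    except (AttributeError, ValueError, OSError):
        return None


def likely_fits_in_host_ram(num_params, dtype="bfloat16", peak_factor=2.0):
    """Return True/False, or None when total RAM is unknown.

    peak_factor is an assumed multiplier for temporary copies made while
    loading; the real peak depends on the library version and load options.
    """
    total = total_host_ram_gib()
    if total is None:
        return None
    return total >= estimate_weights_gib(num_params, dtype) * peak_factor


if __name__ == "__main__":
    n_params = 6.7e9  # assumed parameter count for MPT-7B
    print(f"Estimated weights: {estimate_weights_gib(n_params):.1f} GiB")
    print(f"Total host RAM:    {total_host_ram_gib()}")
    print(f"Likely fits:       {likely_fits_in_host_ram(n_params)}")
```

If the check returns False, the crash is consistent with the kernel running out of host memory while the checkpoint is still on the CPU, before `model.to(device)` is ever reached. This only compares against total RAM, so memory already used by other processes is not accounted for.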