DefaultCPUAllocator: can't allocate memory during perplexity.compute for Llama 2
RinaAkay opened this issue · comments
RinaA commented
I have been trying to use the evaluate library's perplexity.compute for Llama 2 and GPT-NeoX.
The server I am using has adequate resources (NVIDIA GV100GL [Tesla V100 PCIe 32GB]), yet I still get the error below.
Do you have any suggestions or requirements for running this task? Are you planning to add quantization as a parameter to perplexity.compute(model_id=model_id, predictions=predictions)
when a model id is specified? I would appreciate any help with this. Thanks.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/anaconda3/envs/lib/python3.11/site-packages/evaluate/module.py", line 462, in compute
output = self._compute(**inputs, **compute_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/.cache/huggingface/modules/evaluate_modules/metrics/evaluate-metric--perplexity/8ab643ad86f568b7d1d5f7822373fa7401ff5ff0297ccf114b0ca6a33be96bc0/perplexity.py", line 114, in _compute
model = AutoModelForCausalLM.from_pretrained(model_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 566, in from_pretrained
return model_class.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3236, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 711, in __init__
self.gpt_neox = GPTNeoXModel(config)
^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 540, in __init__
self.layers = nn.ModuleList([GPTNeoXLayer(config) for _ in range(config.num_hidden_layers)])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 540, in <listcomp>
self.layers = nn.ModuleList([GPTNeoXLayer(config) for _ in range(config.num_hidden_layers)])
^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 428, in __init__
self.mlp = GPTNeoXMLP(config)
^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 409, in __init__
self.dense_4h_to_h = nn.Linear(config.intermediate_size, config.hidden_size)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/torch/nn/modules/linear.py", line 96, in __init__
self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: [enforce fail at alloc_cpu.cpp:83] err == 0. DefaultCPUAllocator: can't allocate memory: you tried to allocate 603979776 bytes. Error code 12 (Cannot allocate memory)
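From the traceback, the metric calls AutoModelForCausalLM.from_pretrained(model_id) with no extra arguments, so the model weights are allocated on the CPU in full fp32 precision, and the host runs out of RAM before the model ever reaches the 32 GB GPU. A possible workaround (a sketch, not part of the evaluate API) is to load the model yourself in half precision directly onto the GPU and compute perplexity manually from the logits. The model id in the commented usage below is illustrative.

```python
import torch
import torch.nn.functional as F

def perplexity_from_logits(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Perplexity of next-token predictions.

    logits: (batch, seq_len, vocab) float tensor from a causal LM
    labels: (batch, seq_len) long tensor of input token ids
    """
    # Shift so that position i predicts token i+1.
    shift_logits = logits[:, :-1, :]
    shift_labels = labels[:, 1:]
    # Mean negative log-likelihood over all predicted positions.
    loss = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )
    # Perplexity is exp of the mean NLL.
    return torch.exp(loss).item()

# Usage sketch (assumes enough GPU memory for the fp16 weights;
# model id and prompt are illustrative):
#
# from transformers import AutoModelForCausalLM, AutoTokenizer
# model = AutoModelForCausalLM.from_pretrained(
#     "meta-llama/Llama-2-7b-hf",
#     torch_dtype=torch.float16,   # half precision instead of the fp32 default
#     device_map="auto",           # place weights on the GPU, not CPU RAM
# )
# tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
# enc = tokenizer("some prediction text", return_tensors="pt").to(model.device)
# with torch.no_grad():
#     logits = model(**enc).logits
# ppl = perplexity_from_logits(logits.float(), enc["input_ids"])
```

Loading in fp16 roughly halves the memory footprint versus fp32; 8-bit or 4-bit quantized loading would reduce it further, but neither is currently exposed through perplexity.compute's arguments.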