DefaultCPUAllocator: can't allocate memory during perplexity.compute for Llama 2
RinaAkay opened this issue · comments
RinaA commented
I have been trying to use the evaluate library's perplexity.compute for Llama 2 and GPT-NeoX.
The server I am using has adequate resources (NVIDIA GV100GL [Tesla V100 PCIe 32GB]), yet I still get the error below.
Do you have any suggestions or requirements for running this task? Are you planning to add quantization as a parameter to perplexity.compute(model_id=model_id, predictions=predictions)
when a model id is specified? I would appreciate any help with this. Thanks.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/anaconda3/envs/lib/python3.11/site-packages/evaluate/module.py", line 462, in compute
output = self._compute(**inputs, **compute_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/.cache/huggingface/modules/evaluate_modules/metrics/evaluate-metric--perplexity/8ab643ad86f568b7d1d5f7822373fa7401ff5ff0297ccf114b0ca6a33be96bc0/perplexity.py", line 114, in _compute
model = AutoModelForCausalLM.from_pretrained(model_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 566, in from_pretrained
return model_class.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3236, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 711, in __init__
self.gpt_neox = GPTNeoXModel(config)
^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 540, in __init__
self.layers = nn.ModuleList([GPTNeoXLayer(config) for _ in range(config.num_hidden_layers)])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 540, in <listcomp>
self.layers = nn.ModuleList([GPTNeoXLayer(config) for _ in range(config.num_hidden_layers)])
^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 428, in __init__
self.mlp = GPTNeoXMLP(config)
^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 409, in __init__
self.dense_4h_to_h = nn.Linear(config.intermediate_size, config.hidden_size)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/lib/python3.11/site-packages/torch/nn/modules/linear.py", line 96, in __init__
self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: [enforce fail at alloc_cpu.cpp:83] err == 0. DefaultCPUAllocator: can't allocate memory: you tried to allocate 603979776 bytes. Error code 12 (Cannot allocate memory)
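From the traceback, the metric calls AutoModelForCausalLM.from_pretrained(model_id) with no extra arguments, so the model weights are allocated on the CPU in full fp32 precision, and the host runs out of RAM before the model ever reaches the 32 GB GPU. A possible workaround (a sketch, not part of the evaluate API) is to load the model yourself in half precision directly onto the GPU and compute perplexity manually from the logits. The model id in the commented usage below is illustrative.

```python
import torch
import torch.nn.functional as F

def perplexity_from_logits(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Perplexity of next-token predictions.

    logits: (batch, seq_len, vocab) float tensor from a causal LM
    labels: (batch, seq_len) long tensor of input token ids
    """
    # Shift so that position i predicts token i+1.
    shift_logits = logits[:, :-1, :]
    shift_labels = labels[:, 1:]
    # Mean negative log-likelihood over all predicted positions.
    loss = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )
    # Perplexity is exp of the mean NLL.
    return torch.exp(loss).item()

# Usage sketch (assumes enough GPU memory for the fp16 weights;
# model id and prompt are illustrative):
#
# from transformers import AutoModelForCausalLM, AutoTokenizer
# model = AutoModelForCausalLM.from_pretrained(
#     "meta-llama/Llama-2-7b-hf",
#     torch_dtype=torch.float16,   # half precision instead of the fp32 default
#     device_map="auto",           # place weights on the GPU, not CPU RAM
# )
# tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
# enc = tokenizer("some prediction text", return_tensors="pt").to(model.device)
# with torch.no_grad():
#     logits = model(**enc).logits
# ppl = perplexity_from_logits(logits.float(), enc["input_ids"])
```

Loading in fp16 roughly halves the memory footprint versus fp32; 8-bit or 4-bit quantized loading would reduce it further, but neither is currently exposed through perplexity.compute's arguments.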