oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.

API OutOfMemory Error

kupertdev opened this issue · comments

Describe the bug

When I send an API request for text generation, I get an OutOfMemory error. Meanwhile, text generation in the browser works smoothly with no issues.

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

Session > enable "api" > restart > send a text generation request to the API
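For reference, a minimal sketch of the kind of request that triggers this, assuming the OpenAI-compatible API started with the `--api` flag on its default port 5000 (both the port and the exact parameters are assumptions; adjust for your setup). Keeping `max_tokens` modest is worth trying, since generation settings influence how large a cache `transformers` allocates:

```python
import json

# Assumed endpoint of the OpenAI-compatible API (default port 5000 with --api).
API_URL = "http://127.0.0.1:5000/v1/completions"

def build_payload(prompt, max_tokens=200):
    """Build a minimal completion request. A very large max_tokens can
    enlarge the preallocated KV cache and may contribute to the OOM."""
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

payload = build_payload("Hello, world", max_tokens=64)
print(json.dumps(payload))

# To actually send it (requires the server running with --api):
# import requests
# r = requests.post(API_URL, json=payload, timeout=120)
# print(r.json()["choices"][0]["text"])
```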

Screenshot

No response

Logs

Traceback (most recent call last):
  File "/home/kupert/Документы/AI_TOOLS/text-generation-webui/modules/callbacks.py", line 61, in gentask
    ret = self.mfunc(callback=_callback, *args, **self.kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kupert/Документы/AI_TOOLS/text-generation-webui/modules/text_generation.py", line 398, in generate_with_callback
    shared.model.generate(**kwargs)
  File "/home/kupert/Документы/AI_TOOLS/text-generation-webui/installer_files/env/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/kupert/Документы/AI_TOOLS/text-generation-webui/installer_files/env/lib/python3.11/site-packages/transformers/generation/utils.py", line 1922, in generate
    self._prepare_cache_for_generation(
  File "/home/kupert/Документы/AI_TOOLS/text-generation-webui/installer_files/env/lib/python3.11/site-packages/transformers/generation/utils.py", line 1566, in _prepare_cache_for_generation
    model_kwargs[cache_name] = self._get_cache(
                               ^^^^^^^^^^^^^^^^
  File "/home/kupert/Документы/AI_TOOLS/text-generation-webui/installer_files/env/lib/python3.11/site-packages/transformers/generation/utils.py", line 1476, in _get_cache
    self._cache = cache_cls(**cache_kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kupert/Документы/AI_TOOLS/text-generation-webui/installer_files/env/lib/python3.11/site-packages/transformers/cache_utils.py", line 1614, in __init__
    new_layer_value_cache = torch.zeros(cache_shape, dtype=self.dtype, device=layer_device)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
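The failing `torch.zeros` call is the static KV cache being preallocated up front: `transformers` reserves, per layer, key and value tensors of shape `(batch, num_kv_heads, max_cache_len, head_dim)` for the whole maximum context at once, which can exceed free VRAM even when incremental browser generation fits. A rough back-of-the-envelope sketch (the model dimensions below are hypothetical, for illustration only):

```python
def kv_cache_bytes(num_layers, num_kv_heads, max_cache_len, head_dim,
                   batch=1, bytes_per_elem=2):
    """Approximate memory of a fully preallocated static KV cache.
    bytes_per_elem=2 assumes fp16/bf16; the factor 2 below counts
    both the key and the value tensor per layer."""
    per_layer = 2 * batch * num_kv_heads * max_cache_len * head_dim * bytes_per_elem
    return num_layers * per_layer

# Hypothetical 7B-class model with a 32k-token cache:
gib = kv_cache_bytes(num_layers=32, num_kv_heads=32,
                     max_cache_len=32768, head_dim=128) / 2**30
print(f"{gib:.1f} GiB")  # prints "16.0 GiB"
```

This is why lowering the truncation length / maximum context for API requests is a common workaround: the cache size scales linearly with `max_cache_len`.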

System Info

Linux Ubuntu 22.04