zylon-ai / private-gpt

Interact with your documents using the power of GPT, 100% privately, no data leaks

Home Page: https://privategpt.dev


Issue with local RAG setup using LM Studio and privateGPT on MacBook M2

espritdunet opened this issue

Hello,

I've installed privateGPT with Pyenv and Poetry on my MacBook M2 to set up a local RAG using LM Studio version 0.2.21.

I'm using the settings-vllm.yaml configuration file with the following setup:

server:
  env_name: ${APP_ENV:vllm}

llm:
  mode: openailike

embedding:
  mode: huggingface
  ingest_mode: simple

huggingface:
  embedding_hf_model_name: nomic-ai/nomic-embed-text-v1.5-GGUF

openailike:
  api_base: http://localhost:1234/v1
  api_key: lm-studio
  model: lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF

I have successfully loaded the Nomic and Meta models into LM Studio, and they work well. The server is running on localhost:1234, and I can test it by running:

curl http://localhost:1234/v1/models
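
For reference, the same check also works from Python. This is only a minimal sketch, assuming LM Studio's OpenAI-compatible server is listening on localhost:1234 with the model from my settings file loaded:

import requests

BASE_URL = "http://localhost:1234/v1"

# List the models currently loaded in LM Studio (same as the curl call above).
models = requests.get(f"{BASE_URL}/models", timeout=10)
models.raise_for_status()
print(models.json())

# Send a minimal chat completion to confirm the OpenAI-compatible endpoint responds.
payload = {
    "model": "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF",
    "messages": [{"role": "user", "content": "Say hello in one word."}],
}
completion = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=60)
completion.raise_for_status()
print(completion.json()["choices"][0]["message"]["content"])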

However, when I attempt to run privateGPT with the following commands:

export PGPT_PROFILES=vllm
make run

I encounter the following error:

poetry run python -m private_gpt
19:19:07.948 [INFO    ] private_gpt.settings.settings_loader - Starting application with profiles=['default', 'vllm']
--- Logging error ---
Traceback (most recent call last):
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/injector/__init__.py", line 798, in get
    return self._context[key]
           ~~~~~~~~~~~~~^^^^^
KeyError: <class 'private_gpt.ui.ui.PrivateGptUi'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/injector/__init__.py", line 798, in get
    return self._context[key]
           ~~~~~~~~~~~~~^^^^^
KeyError: <class 'private_gpt.server.ingest.ingest_service.IngestService'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/injector/__init__.py", line 798, in get
    return self._context[key]
           ~~~~~~~~~~~~~^^^^^
KeyError: <class 'private_gpt.components.llm.llm_component.LLMComponent'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 270, in hf_raise_for_status
    response.raise_for_status()
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/resolve/main/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/transformers/utils/hub.py", line 398, in cached_file
    resolved_file = hf_hub_download(
                    ^^^^^^^^^^^^^^^^
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1374, in hf_hub_download
    raise head_call_error
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1247, in hf_hub_download
    metadata = get_hf_file_metadata(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1624, in get_hf_file_metadata
    r = _request_wrapper(
        ^^^^^^^^^^^^^^^^^
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 402, in _request_wrapper
    response = _request_wrapper(
               ^^^^^^^^^^^^^^^^^
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 426, in _request_wrapper
    hf_raise_for_status(response)
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 286, in hf_raise_for_status
    raise GatedRepoError(message, response) from e
huggingface_hub.utils._errors.GatedRepoError: 401 Client Error. (Request ID: Root=1-662d338f-03d9bd0715e27ec835496b21;71a24281-85dc-4848-bdc3-d2a9b2a1ef96)

Cannot access gated repo for url https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/resolve/main/config.json.
Access to model mistralai/Mistral-7B-Instruct-v0.2 is restricted. You must be authenticated to access it.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/netspirit/DEV/private-gpt/private_gpt/components/llm/llm_component.py", line 30, in __init__
    AutoTokenizer.from_pretrained(
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 782, in from_pretrained
    config = AutoConfig.from_pretrained(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1111, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/transformers/configuration_utils.py", line 633, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/transformers/configuration_utils.py", line 688, in _get_config_dict
    resolved_config_file = cached_file(
                           ^^^^^^^^^^^^
  File "/Users/netspirit/Library/Caches/pypoetry/virtualenvs/private-gpt-6oSpWw2B-py3.11/lib/python3.11/site-packages/transformers/utils/hub.py", line 416, in cached_file
    raise EnvironmentError(
OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2.
401 Client Error. (Request ID: Root=1-662d338f-03d9bd0715e27ec835496b21;71a24281-85dc-4848-bdc3-d2a9b2a1ef96)

Cannot access gated repo for url https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/resolve/main/config.json.
Access to model mistralai/Mistral-7B-Instruct-v0.2 is restricted. You must be authenticated to access it.

I'm puzzled as to why privateGPT is attempting to connect to the OpenAI and Hugging Face sites. What should I modify to correct this issue?

Any guidance would be greatly appreciated.

Thank you!

I have the same issue

After further testing, I have realized that privateGPT's code does not seem to include the components needed to use the local LM Studio server for the "text embeddings" feature it provides as of version 0.2.19.

The settings-vllm.yaml configuration I am using appears to lack the settings needed to connect to the local LM Studio server's new Text Embeddings function.

Here's some additional information:

I've consulted the LM Studio Text Embeddings documentation. My local connection tests work properly, as evidenced by the LM Studio server logs and by my local prompts. For instance:

curl http://localhost:1234/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Your text string goes here",
    "model": "model-identifier-here"
  }'
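
The same request also works from Python via the official openai client pointed at the local server. A minimal sketch, with the model identifier being a placeholder just as in the curl example:

from openai import OpenAI

# Point the official OpenAI client at the local LM Studio server;
# LM Studio ignores the API key, but the client requires one.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.embeddings.create(
    model="model-identifier-here",  # placeholder, as in the curl example
    input="Your text string goes here",
)
print(len(response.data[0].embedding))  # dimensionality of the returned vector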

However, I am unsure how to modify my privateGPT configuration file to accommodate this, and I am concerned that the current code may not support connecting in "openailike" mode to the Text Embeddings function supplied by LM Studio.

If time permits in the coming days, I will look into the privateGPT code and potentially add support for connecting to LM Studio for both the LLM and the text embedding models.

I appreciate any guidance on how to address this issue.

Thank you!

Hello everyone,

I wanted to provide an update on my issue regarding the integration of LM Studio's local server for LLM models and text embeddings with PrivateGPT.

Initially, I faced difficulties in making PrivateGPT communicate with the LM Studio server for the embedding model, as the current code of privateGPT did not support this setup. I found an alternative solution.

I utilized the huggingface_hub Python library to download the text embedding models locally. This approach aligns with PrivateGPT's current functionalities and enabled me to implement the RAG (Retrieval-Augmented Generation) feature successfully.

Here are the steps I followed:

1. Install the huggingface_hub library: I installed it using pip:
pip install huggingface_hub

2. Log in to Hugging Face: I logged in to my Hugging Face account using the CLI:
huggingface-cli login

I entered the token I created on the Hugging Face website when prompted.
3. Run PrivateGPT Setup: I used the commands provided by PrivateGPT to populate the local directory with the embedding models. This step is part of the normal setup process for PrivateGPT (a Python sketch of the same idea follows these steps):
poetry run python scripts/setup
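
For anyone who prefers doing this from Python, here is a rough sketch of what the login and model download amount to. It is an illustration using huggingface_hub directly, not the exact code that scripts/setup runs, and the token below is a placeholder:

from huggingface_hub import login, snapshot_download

# Authenticate with the token created on the Hugging Face website
# (equivalent to `huggingface-cli login`).
login(token="hf_...")  # placeholder token

# Download the embedding model from my settings file into the local
# Hugging Face cache so it is available offline at runtime.
local_dir = snapshot_download(repo_id="nomic-ai/nomic-embed-text-v1.5-GGUF")
print(f"Model files cached at: {local_dir}")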

After these steps, everything worked seamlessly, and I was able to run PrivateGPT with the desired setup. It turned out to be the standard procedure described in PrivateGPT's documentation, which I initially misunderstood.

I hope this update helps others who might be facing similar issues with LM Studio. If there are any further questions or if anyone needs more details on the implementation, feel free to reach out.

Thank you!