Future-House / paper-qa

High accuracy RAG for answering questions from scientific documents with citations

Home Page: https://futurehouse.gitbook.io/futurehouse-cookbook

Repository from Github: https://github.com/Future-House/paper-qa

PDF Parsing Still Calls OpenAI API Despite Using Deepseek-Chat as LLM

huangwei021230 opened this issue

Description:

I am using paper-qa with deepseek-chat as the primary language model (LLM). However, during the PDF parsing step, the code still attempts to call the OpenAI API, which results in an error. I have configured the llm_config to use deepseek-chat for all LLM-related tasks, but it seems that the PDF parsing step is not respecting this configuration.

Code:

model = "deepseek/deepseek-chat"
llm_config={
    "model_list": [{
        "model_name": model,
        "litellm_params": {
            "model": model,
            "api_key": API_KEY
        }
    }]
}

try:
    answer_response = ask(
        query="How does RAG (Retrieval Augmented Generation) work",
        settings=Settings(
            llm=model,
            summary_llm=model,
            agent=AgentSettings(agent_llm=model, agent_llm_config=llm_config, index_concurrency=2),
            llm_config=llm_config,
            summary_llm_config=llm_config,
            paper_directory=".",
            verbosity=3,
        ),
    )
    print(answer_response)
except Exception as e:
    print(f"An error occurred: {e}")
    traceback.print_exc()

Error:

The error occurs during the PDF parsing step, where the code attempts to call the OpenAI API for embeddings, despite the configuration specifying deepseek-chat as the model.

[16:21:52] SEMANTIC_SCHOLAR_API_KEY environment variable not set. Semantic Scholar API rate limits may apply.
           CROSSREF_MAILTO environment variable not set. Crossref API rate limits may apply.                                                                   
           CROSSREF_API_KEY environment variable not set. Crossref API rate limits may apply.                                                                  
[16:21:53] Metadata not found for From Local to Global: A GraphRAG Approach to Query-Focused Summarization in SemanticScholarProvider.                         
[16:21:56] Metadata not found for From Local to Global: A GraphRAG Approach to Query-Focused Summarization in CrossrefProvider.                                
[16:21:57] Error parsing 2404.16130.pdf, skipping index for this file.                                                                                         
           ╭─────────────────────────────────────────────────────── Traceback (most recent call last) ────────────────────────────────────────────────────────╮
           │ /Users/huangwei021230/anaconda3/envs/paperqa/lib/python3.11/site-packages/litellm/llms/openai/openai.py:1084 in aembedding                       │
           │                                                                                                                                                  │
           │   1081 │   │   │   │   max_retries=max_retries,                                                                                                  │
           │   1082 │   │   │   │   client=client,                                                                                                            │
           │   1083 │   │   │   )                                                                                                                             │
           │ ❱ 1084 │   │   │   headers, response = await self.make_openai_embedding_request(                                                                 │
           │   1085 │   │   │   │   openai_aclient=openai_aclient,                                                                                            │
           │   1086 │   │   │   │   data=data,                                                                                                                │
           │   1087 │   │   │   │   timeout=timeout,                                                                                                          │
           │                                                                                                                                                  │
           │ /Users/huangwei021230/anaconda3/envs/paperqa/lib/python3.11/site-packages/litellm/litellm_core_utils/logging_utils.py:131 in async_wrapper       │
           │                                                                                                                                                  │
           │   128 │   │   async def async_wrapper(*args, **kwargs):                                                                                          │
           │   129 │   │   │   start_time = datetime.now()                                                                                                    │
           │   130 │   │   │   try:                                                                                                                           │
           │ ❱ 131 │   │   │   │   result = await func(*args, **kwargs)                                                                                       │
           │   132 │   │   │   │   return result                                                                                                              │
           │   133 │   │   │   finally:                                                                                                                       │
           │   134 │   │   │   │   end_time = datetime.now()                                                                                                  │
           │                                                                                                                                                  │
           │ /Users/huangwei021230/anaconda3/envs/paperqa/lib/python3.11/site-packages/litellm/llms/openai/openai.py:1037 in make_openai_embedding_request    │
           │                                                                                                                                                  │
           │   1034 │   │   │   response = raw_response.parse()                                                                                               │
           │   1035 │   │   │   return headers, response                                                                                                      │
           │   1036 │   │   except Exception as e:                                                                                                            │
           │ ❱ 1037 │   │   │   raise e                                                                                                                       │
           │   1038 │                                                                                                                                         │
           │   1039 │   @track_llm_api_timing()                                                                                                               │
           │   1040 │   def make_sync_openai_embedding_request(                                                                                               │
           │                                                                                                                                                  │
           │ /Users/huangwei021230/anaconda3/envs/paperqa/lib/python3.11/site-packages/litellm/llms/openai/openai.py:1030 in make_openai_embedding_request    │
           │                                                                                                                                                  │
           │   1027 │   │   - call embeddings.create by default                                                                                               │
           │   1028 │   │   """                                                                                                                               │
           │   1029 │   │   try:                                                                                                                              │
           │ ❱ 1030 │   │   │   raw_response = await openai_aclient.embeddings.with_raw_response.create(                                                      │
           │   1031 │   │   │   │   **data, timeout=timeout                                                                                                   │
           │   1032 │   │   │   )  # type: ignore                                                                                                             │
           │   1033 │   │   │   headers = dict(raw_response.headers)                                                                                          │
           │                                                                                                                                                  │
           │ /Users/huangwei021230/anaconda3/envs/paperqa/lib/python3.11/site-packages/openai/_legacy_response.py:381 in wrapped                              │
           │                                                                                                                                                  │
           │   378 │   │                                                                                                                                      │
           │   379 │   │   kwargs["extra_headers"] = extra_headers                                                                                            │
           │   380 │   │                                                                                                                                      │
           │ ❱ 381 │   │   return cast(LegacyAPIResponse[R], await func(*args, **kwargs))                                                                     │
           │   382 │                                                                                                                                          │
           │   383 │   return wrapped                                                                                                                         │
           │   384                                                                                                                                            │
           │                                                                                                                                                  │
           │ /Users/huangwei021230/anaconda3/envs/paperqa/lib/python3.11/site-packages/openai/resources/embeddings.py:243 in create                           │
           │                                                                                                                                                  │
           │   240 │   │   │                                                                                                                                  │
           │   241 │   │   │   return obj                                                                                                                     │
           │   242 │   │                                                                                                                                      │
           │ ❱ 243 │   │   return await self._post(                                                                                                           │
           │   244 │   │   │   "/embeddings",                                                                                                                 │
           │   245 │   │   │   body=maybe_transform(params, embedding_create_params.EmbeddingCreateParams),                                                   │
           │   246 │   │   │   options=make_request_options(                                                                                                  │
           │                                                                                                                                                  │
           │ /Users/huangwei021230/anaconda3/envs/paperqa/lib/python3.11/site-packages/openai/_base_client.py:1856 in post                                    │
           │                                                                                                                                                  │
           │   1853 │   │   opts = FinalRequestOptions.construct(                                                                                             │
           │   1854 │   │   │   method="post", url=path, json_data=body, files=await                                                                          │
           │        async_to_httpx_files(files), **options                                                                                                    │
           │   1855 │   │   )                                                                                                                                 │
           │ ❱ 1856 │   │   return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)                                                    │
           │   1857 │                                                                                                                                         │
           │   1858 │   async def patch(                                                                                                                      │
           │   1859 │   │   self,                                                                                                                             │
           │                                                                                                                                                  │
           │ /Users/huangwei021230/anaconda3/envs/paperqa/lib/python3.11/site-packages/openai/_base_client.py:1550 in request                                 │
           │                                                                                                                                                  │
           │   1547 │   │   else:                                                                                                                             │
           │   1548 │   │   │   retries_taken = 0                                                                                                             │
           │   1549 │   │                                                                                                                                     │
           │ ❱ 1550 │   │   return await self._request(                                                                                                       │
           │   1551 │   │   │   cast_to=cast_to,                                                                                                              │
           │   1552 │   │   │   options=options,                                                                                                              │
           │   1553 │   │   │   stream=stream,                                                                                                                │
           │                                                                                                                                                  │
           │ /Users/huangwei021230/anaconda3/envs/paperqa/lib/python3.11/site-packages/openai/_base_client.py:1651 in _request                                │
           │                                                                                                                                                  │
           │   1648 │   │   │   │   await err.response.aread()                                                                                                │
           │   1649 │   │   │                                                                                                                                 │
           │   1650 │   │   │   log.debug("Re-raising status error")                                                                                          │
           │ ❱ 1651 │   │   │   raise self._make_status_error_from_response(err.response) from None                                                           │
           │   1652 │   │                                                                                                                                     │
           │   1653 │   │   return await self._process_response(                                                                                              │
           │   1654 │   │   │   cast_to=cast_to,                                                                                                              │
           ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Expected Behavior:

The PDF parsing step should use the deepseek-chat model for embeddings, as specified in the llm_config.

Actual Behavior:

The PDF parsing step attempts to call the OpenAI API, resulting in an error.

Additional Information:

  • The error suggests that the embedding step is still using OpenAI's API, even though deepseek-chat is configured as the primary model.
  • It seems that the embedding step is not respecting the llm_config settings.

Questions:

  1. Is there a specific configuration or setting that needs to be adjusted to ensure that the PDF parsing step uses deepseek-chat for embeddings?

Thank you for your assistance!

Hey @huangwei021230! I'm here to help you out with any issues you're facing. If you need assistance with bug fixes, have questions, or want to contribute, feel free to ask!

The issue you're encountering comes from paper-qa's default settings, which fall back to OpenAI's API for embeddings even though deepseek-chat is configured for the LLMs. By default, paper-qa uses OpenAI's text-embedding-3-small model for embeddings if no specific model is set [1].

To ensure that deepseek-chat is used for embeddings, you need to explicitly set the embedding parameter in the Settings object to use a model supported by deepseek-chat. The embedding_model_factory function in paper-qa is responsible for creating the appropriate EmbeddingModel based on the provided embedding string [2].

Here's what you can do:

  1. Specify the Embedding Model: Ensure that the embedding parameter in your Settings is set to a model that deepseek-chat supports. If deepseek-chat does not directly support embeddings, you might need to use a compatible model or approach.

  2. Check the embedding_model_factory: Verify that the embedding_model_factory is correctly configured to create an EmbeddingModel that aligns with your deepseek-chat setup.

  3. Update the Configuration: Modify your Settings to include the correct embedding model. For example:

    settings = Settings(
        llm=model,
        summary_llm=model,
        agent=AgentSettings(agent_llm=model, agent_llm_config=llm_config, index_concurrency=2),
        llm_config=llm_config,
        summary_llm_config=llm_config,
        embedding="your-embedding-model",  # Specify the correct embedding model here
        paper_directory=".",
        verbosity=3,
    )
  4. Verify Model Support: Ensure that deepseek-chat or the specified embedding model is supported by litellm and is correctly configured in your environment.

By explicitly setting the embedding model in your configuration, you should be able to avoid the fallback to OpenAI's API. If deepseek-chat does not support embeddings directly, you may need to use a different model for embeddings that is compatible with your setup.


In paperqa's ask method, there are four models that need to be configured: llm, summary_llm, agent_llm (within the agent configuration), and the embedding parameter. The first three are for specifying general large language models, while the last one, embedding, is for the model used to generate embeddings.

By default, paperqa uses OpenAI for all four of these models. It seems you have configured only the first three – llm, summary_llm, and agent_llm – but have not set the embedding parameter. This is likely the reason for the error you are encountering.

DeepSeek's API does not currently offer embedding models. Therefore, to resolve this, you will need to specify an embedding model from a different provider for the embedding parameter in paperqa.
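
For example, the fix can be as small as adding an embedding argument to the existing Settings. The sketch below assumes an OpenAI key is available, so litellm can call text-embedding-3-small (paper-qa's default embedding model); any other embedding model string that litellm can route should work the same way:

# Sketch: keep deepseek-chat for all LLM roles, but point the embedding
# parameter at a provider that actually offers an embeddings endpoint
# (assumption: OPENAI_API_KEY is set so text-embedding-3-small is reachable).
settings = Settings(
    llm=model,
    summary_llm=model,
    agent=AgentSettings(agent_llm=model, agent_llm_config=llm_config, index_concurrency=2),
    llm_config=llm_config,
    summary_llm_config=llm_config,
    embedding="text-embedding-3-small",  # embeddings from a non-DeepSeek provider
    paper_directory=".",
    verbosity=3,
)

answer_response = ask(
    query="How does RAG (Retrieval Augmented Generation) work",
    settings=settings,
)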

Hello @huangwei021230 and @lingyaochu.

Thanks, @lingyaochu, for the support here. You are correct: paper-qa defaults to OpenAI models for these four attributes. @huangwei021230 correctly switched the LLMs, but paper-qa is still using text-embedding-3-small for the embeddings, as can be seen here:

           │ /Users/huangwei021230/anaconda3/envs/paperqa/lib/python3.11/site-packages/openai/resources/embeddings.py:243 in create                           │
           │                                                                                                                                                  │
           │   240 │   │   │                                                                                                                                  │
           │   241 │   │   │   return obj                                                                                                                     │
           │   242 │   │                                                                                                                                      │
           │ ❱ 243 │   │   return await self._post(                                                                                                           │
           │   244 │   │   │   "/embeddings",                                                                                                                 │
           │   245 │   │   │   body=maybe_transform(params, embedding_create_params.EmbeddingCreateParams),                                                   │
           │   246 │   │   │   options=make_request_options( 

I'd suggest taking a look at the embeddings section of the README. The embeddings are used to encode the papers in the vector store and to run the similarity search.
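
For instance, to avoid OpenAI entirely, a locally hosted embedding model can be wired in. This is only a sketch under the assumption that an Ollama server is running on its default local address with the mxbai-embed-large model pulled, so that litellm can route the ollama/ embedding string to it:

from paperqa import Settings, ask
from paperqa.settings import AgentSettings

model = "deepseek/deepseek-chat"
API_KEY = "..."  # DeepSeek API key
llm_config = {
    "model_list": [
        {
            "model_name": model,
            "litellm_params": {"model": model, "api_key": API_KEY},
        }
    ]
}

# Assumption: `ollama pull mxbai-embed-large` has been run and the Ollama server
# is listening on its default local address, so the embedding calls stay local.
settings = Settings(
    llm=model,
    summary_llm=model,
    agent=AgentSettings(agent_llm=model, agent_llm_config=llm_config),
    llm_config=llm_config,
    summary_llm_config=llm_config,
    embedding="ollama/mxbai-embed-large",  # local embeddings, no OpenAI call
    paper_directory=".",
)

answer_response = ask(
    query="How does RAG (Retrieval Augmented Generation) work",
    settings=settings,
)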