patterns-ai-core / langchainrb

Build LLM-powered applications in Ruby

Home Page: https://rubydoc.info/gems/langchainrb


OpenAI can't get default_dimensions when I specify a custom embedding_model.

lukefan opened this issue

Many services are now compatible with the OpenAI API, including Ollama.
So when I use OpenAI as the LLM, I'd like to be able to specify a custom embedding_model.
You could borrow the approach used in the Ollama adapter and simply run one embedding call to confirm the length of the vector the current embedding model outputs, just like lib/langchain/llm/ollama.rb does; see the sketch below.
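
Roughly, the idea (a sketch of that probe approach, not the library's actual code; the method name and memoization are illustrative):

def default_dimensions
  # Embed a short probe string once and cache the length of the
  # returned vector instead of relying on a hard-coded table.
  @default_dimensions ||= embed(text: "test").embedding.size
end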

@lukefan Could you please show me the code and the error you're seeing?

config/initializers/langchainrb_rails.rb:

LangchainrbRails.configure do |config|
  config.vectorsearch = Langchain::Vectorsearch::Pgvector.new(
    llm: Langchain::LLM::OpenAI.new(
      api_key: 'ollama',
      llm_options: { uri_base: 'http://localhost:11434' },
      default_options: {
        embeddings_model_name: 'chevalblanc/dmeta-embedding-zh:latest',
        n: 1,
        temperature: 0.8,
        chat_completion_model_name: 'Llama3-8B-Chinese-Chat:latest'
      }
    )
  )
end

So I got this error:
key not found: "chevalblanc/dmeta-embedding-zh:latest"

But if you wanted to use the chevalblanc/embedding model, for example, make sure you've pulled the model down first:

ollama pull chevalblanc/embedding
pulling manifest
pulling 7c43e3a2e21a... 100% ▕████████████████▏ 651 MB
pulling 4964a5df96b1... 100% ▕████████████████▏  260 B
verifying sha256 digest
writing manifest
removing any unused layers
success

and then use the Ollama LLM:

llm = Langchain::LLM::Ollama.new(url: ENV["OLLAMA_URL"], default_options: { embeddings_model_name: "chevalblanc/embedding" })

llm.embed text: "..."
=>
#<Langchain::LLM::OllamaResponse:0x00000001253740d0
 @model="chevalblanc/embedding",
 @prompt_tokens=nil,
 @raw_response=
  {"embedding"=>
    [-0.40095722675323486,
     -0.019067473709583282,
     -0.2779462933540344,
     ...
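
That vector length is exactly what default_dimensions needs. Assuming OllamaResponse#embedding returns the array shown under @raw_response above, you can read the dimension off directly:

llm.embed(text: "test").embedding.length
# => the model's output dimension (768 for many BERT-style embedders)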

https://ollama.com/milkey/dmeta-embedding-zh
They changed the URL.
There are many services compatible with the OpenAI API, such as together.ai or groq.com; I just used Ollama as an example.

In lib/langchain/llm/ollama.rb, the case where the embedding model is not found in EMBEDDING_SIZES is handled. In the OpenAI adapter, those sizes are hard-coded.
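
For illustration, the hard-coded lookup could fall back to a probe embedding instead of raising KeyError for unknown models. A sketch, assuming a table and accessors roughly like the adapter's (the names and values here are illustrative):

EMBEDDING_SIZES = {
  "text-embedding-ada-002" => 1536,
  "text-embedding-3-small" => 1536,
  "text-embedding-3-large" => 3072
}.freeze

def default_dimensions
  # Hash#fetch with a block avoids the "key not found" KeyError and
  # measures the vector once at runtime for unrecognized models.
  EMBEDDING_SIZES.fetch(embeddings_model_name) do
    embed(text: "test").embedding.size
  end
end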

@lukefan Then...

ollama pull milkey/dmeta-embedding-zh:f16 # or :f32
llm = Langchain::LLM::Ollama.new(url: ENV["OLLAMA_URL"], default_options: { embeddings_model_name: "milkey/dmeta-embedding-zh:f16" })

I want to be able to use the OpenAI adapter to call various compatible services, not just Ollama.

@lukefan This library is built a bit differently, though. While I understand that some other LLM providers have created interfaces similar to OpenAI's, not all of them share that interface: Google Gemini, Anthropic, and Cohere, for example.

I don't expect to use OpenAI's API to call every service, but I will try to choose services that are compatible with OpenAI.
So I hope the OpenAI adapter can take compatibility into greater consideration; doing so would also make your project work with more service platforms.

@andreibondarev Sorry, there was a problem with my test: Ollama does not implement OpenAI-compatible embeddings.
Together, however, does implement that part of the API; I should have used together.xyz's embeddings for testing.

@lukefan Right, the embeddings are different. Should we close this issue now?

@andreibondarev

llm = Langchain::LLM::OpenAI.new(
  api_key: my_together_key,
  llm_options: { uri_base: 'https://api.together.xyz' },
  default_options: {
    n: 1,
    temperature: 1,
    chat_completion_model_name: "Qwen/Qwen1.5-72B",
    embeddings_model_name: "WhereIsAI/UAE-Large-V1"
  }
)

This code runs:

llm.chat(messages: [{role: "user", content: "What is the meaning of life?"}]).completion

but this call raises an error:

llm.embed(text: "foo bar").embedding

error message:
~/.rvm/gems/ruby-3.1.4/gems/langchainrb-0.11.4/lib/langchain/utils/token_length/openai_validator.rb:73:in `token_length': undefined method `encode' for nil:NilClass (NoMethodError)

      encoder.encode(text).length
             ^^^^^^^
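
The backtrace suggests the token-length validator resolves a Tiktoken encoder by model name and gets nil back for models Tiktoken doesn't recognize. A minimal reproduction of that guess, assuming the validator uses tiktoken_ruby's Tiktoken.encoding_for_model (which returns nil for unknown model names):

require "tiktoken_ruby"

# Tiktoken only knows OpenAI's own model names, so a Together-hosted
# model resolves to no encoder at all.
encoder = Tiktoken.encoding_for_model("WhereIsAI/UAE-Large-V1")
# => nil
encoder.encode("foo bar")
# => NoMethodError: undefined method `encode' for nil:NilClass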

Yes, this should solve the problem better.