openai / chatgpt-retrieval-plugin

The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.

`services.openai.get_embeddings` does not expose the `dimensions` kwarg of `openai.Embedding.create`

caseyclements opened this issue

Although `EMBEDDING_DIMENSION` is described as a required variable in the README, the code never uses it. It appears only in a few of the datastores' setup.md instructions, where it is used to create their vector indexes.

The README goes on to say "The plugin uses OpenAI's embeddings model (text-embedding-3-large 256 dimension embeddings by default)", but in practice `len(get_embeddings(["Some input text"])[0]) == 3072`.

This is urgent for me because I will soon submit a PR adding MongoDB Atlas as a new datastore. Atlas' imminent next release will raise the maximum supported dimension to 4096, but previous versions are limited to 2048, which is less than 3072. If I understand correctly, this means that the following line is incorrect: "For example, if your vector database supports up to 1024 dimensions, you can use text-embedding-3-large and set the dimensions API parameter to 1024."

Making changes to support this is not difficult. If agreed, I would like to submit a PR to fix this issue. (The MongoDB datastore addition would come as a separate PR.)

This issue has been resolved in PR #428.