run-llama / LlamaIndexTS

LlamaIndex in TypeScript

Home Page: https://ts.llamaindex.ai

Add support for Pinecone namespaces

eburnette opened this issue

Pinecone takes a namespace (PINECONE_NAME_SPACE) that can either be empty (default) or a string. I didn't see a way in llamaindex.ts to pass that in when inserting or querying, and setting the environment variable had no effect, so it's currently always using the default namespace.

Documentation: https://docs.pinecone.io/docs/namespaces
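To make the request concrete, here is a minimal sketch of how a namespace could be resolved: an explicit argument wins, then the environment variable named above, then the default empty namespace. The helper `resolveNamespace` and the `Env` type are hypothetical and are not part of llamaindex or the Pinecone client; the environment is passed in as a plain object so the sketch runs anywhere.

```typescript
// Hypothetical helper, for illustration only. Nothing here is llamaindex or Pinecone API.
type Env = Record<string, string | undefined>;

// Pinecone's default namespace is the empty string.
const DEFAULT_NAMESPACE = "";

function resolveNamespace(explicit: string | undefined, env: Env = {}): string {
  // An explicitly passed namespace takes priority over the environment.
  if (explicit !== undefined) return explicit;
  // PINECONE_NAME_SPACE is the variable name mentioned in this issue.
  const fromEnv = env["PINECONE_NAME_SPACE"];
  return fromEnv !== undefined ? fromEnv : DEFAULT_NAMESPACE;
}
```

With this ordering, a caller can still reach the default namespace by passing `""` explicitly even when the environment variable is set.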

So a workaround could be:

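    import { Pinecone } from '@pinecone-database/pinecone';

    // describeIndexStats(), namespace(), fetch(), and query() belong to the
    // Pinecone client's index object, not to VectorStoreIndex.
    const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY as string });
    const index = pc.index('my-index');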

    // list all namespaces (the default one is '')
    await index.describeIndexStats();
    // Response: {
    //   namespaces: {
    //     '': { recordCount: 10 },
    //     foo: { recordCount: 2000 },
    //     bar: { recordCount: 2000 }
    //   },
    //   dimension: 1536,
    //   indexFullness: 0,
    //   totalRecordCount: 4010
    // }

    // Target a namespace arbitrarily named "ns-1" in our index.
    // namespace() returns a new handle, so keep the result.
    const ns1 = index.namespace('ns-1');

    console.log(ns1.target); // { index: 'my-index', namespace: 'ns-1', indexHostUrl: undefined }

    // Now perform index operations in the targeted index and namespace
    await ns1.fetch(['3']);

    await ns1.query({ topK: 3, vector: [0.22, 0.66] });
    // {
    //   matches: [
    //     {
    //       id: '556',
    //       score: 1.00000012,
    //       values: [],
    //       sparseValues: undefined,
    //       metadata: undefined
    //     },
    //     {
    //       id: '137',
    //       score: 1.00000012,
    //       values: [],
    //       sparseValues: undefined,
    //       metadata: undefined
    //     },
    //     {
    //       id: '129',
    //       score: 1.00000012,
    //       values: [],
    //       sparseValues: undefined,
    //       metadata: undefined
    //     }
    //   ],
    //   namespace: 'ns-1',
    //   usage: {
    //     readUnits: 5
    //   }
    // }
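As a small aside, the stats response shown earlier in this comment can be turned into a readable list of namespaces. This is only a sketch: the `IndexStats` interface is inferred from that sample response and covers just the fields used here, and `summarizeNamespaces` is a hypothetical helper, not a Pinecone or llamaindex function.

```typescript
// Shape inferred from the sample describeIndexStats() response above (assumption).
interface IndexStats {
  namespaces?: Record<string, { recordCount: number }>;
  totalRecordCount?: number;
}

interface NamespaceSummary {
  name: string;
  recordCount: number;
}

// Hypothetical helper: lists each namespace with its record count,
// labelling the empty-string namespace as "(default)".
function summarizeNamespaces(stats: IndexStats): NamespaceSummary[] {
  const namespaces = stats.namespaces ?? {};
  return Object.keys(namespaces).map((name) => ({
    name: name === "" ? "(default)" : name,
    recordCount: namespaces[name].recordCount,
  }));
}

// Sample data copied from the response in this comment.
const sampleStats: IndexStats = {
  namespaces: {
    "": { recordCount: 10 },
    foo: { recordCount: 2000 },
    bar: { recordCount: 2000 },
  },
  totalRecordCount: 4010,
};
const summary = summarizeNamespaces(sampleStats);
```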

Would it be necessary to extend the queryEngine params to accept namespaces? @marcusschiesser

I also need this solution :) I'm currently using Pinecone's index object directly:

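import { OpenAIEmbedding } from "llamaindex";
import { Pinecone } from "@pinecone-database/pinecone";

const embedModel = new OpenAIEmbedding({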
  apiKey: process.env.OPENAI_API_KEY as string,
  model: "text-embedding-3-large",
  dimensions: 768,
});
const vector = await embedModel.getTextEmbedding("Hello world");

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY as string });
const response = await pc
  .index("my-index")
  .namespace("my-namespace")
  .query({ topK: 5, vector, includeMetadata: true });

I think we can easily add a namespace parameter to the PineconeVectorStore, then you can create an index per namespace, like this:

const ctx = serviceContextFromDefaults();
const pcvs1 = new PineconeVectorStore({ indexName: 'my-index', namespace: 'ns-1' });
const index1 = await VectorStoreIndex.fromVectorStore(pcvs1, ctx);
const pcvs2 = new PineconeVectorStore({ indexName: 'my-index', namespace: 'ns-2' });
const index2 = await VectorStoreIndex.fromVectorStore(pcvs2, ctx);
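The idea of binding a namespace at construction time can be sketched without a Pinecone account. The example below is an assumption-laden illustration, not the actual PineconeVectorStore implementation: `FakeIndex` is an in-memory stand-in for a vector index, and `NamespacedStore` is a hypothetical wrapper that forwards its fixed namespace on every call.

```typescript
// All names below are hypothetical; this is not llamaindex or Pinecone code.
type VectorRecord = { id: string; values: number[] };

// In-memory stand-in for one vector index that is partitioned by namespace.
class FakeIndex {
  private data = new Map<string, Map<string, VectorRecord>>();

  upsert(namespace: string, records: VectorRecord[]): void {
    const bucket = this.data.get(namespace) ?? new Map<string, VectorRecord>();
    records.forEach((r) => bucket.set(r.id, r));
    this.data.set(namespace, bucket);
  }

  ids(namespace: string): string[] {
    const bucket = this.data.get(namespace);
    return bucket ? Array.from(bucket.keys()) : [];
  }
}

interface StoreOptions {
  index: FakeIndex;
  namespace?: string; // omitted means the default (empty) namespace
}

// Wrapper that fixes the namespace once, so callers never pass it per operation.
class NamespacedStore {
  private readonly index: FakeIndex;
  private readonly namespace: string;

  constructor({ index, namespace = "" }: StoreOptions) {
    this.index = index;
    this.namespace = namespace;
  }

  add(records: VectorRecord[]): void {
    this.index.upsert(this.namespace, records);
  }

  ids(): string[] {
    return this.index.ids(this.namespace);
  }
}

// Two stores share one index but see only their own namespace.
const shared = new FakeIndex();
const store1 = new NamespacedStore({ index: shared, namespace: "ns-1" });
const store2 = new NamespacedStore({ index: shared, namespace: "ns-2" });
store1.add([{ id: "a", values: [0.1, 0.2] }]);
store2.add([{ id: "b", values: [0.3, 0.4] }]);
```

Fixing the namespace in the constructor keeps the insert and query call sites unchanged, which is why one store (and one index object) per namespace is a low-friction design.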

This feature is implemented in the new 0.1.19 release. Please give it a try!