RedisMemory support

slorello89 opened this issue 10 months ago · comments

👋 Hi there, I'm Steve from Redis.

I wanted to check whether you all would accept a PR adding Redis as one of the MemoryStorage services. Our Vector API should be compatible with IMemoryDb, with the one caveat that getting tags working right might be a bit tricky: Redis requires you to specify which fields are indexed at index creation time, so for:

Task CreateIndexAsync(string index, int vectorSize, CancellationToken cancellationToken = default);

I can certainly get the vectors indexed easily, but non-vector fields might need to be worked in via configuration.

Anyway, let me know what you all think.

Devis Lucato · Answer 1 · Wed Dec 13 2023 12:07:23 GMT+0800 (China Standard Time)

Hi @slorello89 we surely would love having Redis support.

There are a couple of approaches: we could host the code here, though you would have to go through our processes, or you could develop a NuGet package in a separate repo, and we would add it to the service as one option. We can also start with a branch here and later move the code to a repo of yours.

Here are some examples of what a separate repo would look like:

Postgres (stub repo, showing basic structure): https://repos.opensource.microsoft.com/orgs/microsoft/repos/kernel-memory-postgres
ElasticSearch: https://github.com/freemindlabsinc/FreeMindLabs.SemanticKernel

> with the one caveat that getting tags working right might be a bit tricky, Redis requires you to specify which fields are indexed when creating the index time

I'm not too familiar with this, but I hope we can get a serialization format that allows both indexing and search :-)

Steve Lorello · Answer 2 · Wed Dec 13 2023 22:38:00 GMT+0800 (China Standard Time)

Hi @dluc - sounds good to me. Whatever is easier for you all. I'll probably just start with a PR here, and if it makes sense I can always kick it out to another repo/package.

So, wrt indexing: the trick is that Redis (traditionally a pure key-value store) can only search on values in a defined secondary index. So in our case, we'd have to declare whatever tags you want to be able to search on when the index is created. My initial thought (short of adding an extra API) would be to define the tags you want to index in the Configuration of the service?
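For illustration, here is a minimal sketch of why the tag list has to be known up front. It builds the argument list for a Redis `FT.CREATE` command, where every searchable field must appear in the `SCHEMA` clause. The class name, the `embedding` field name, the key prefix, and the choice of HNSW with cosine distance are assumptions made for this example, not details taken from the actual connector.

```csharp
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not part of Kernel Memory or any Redis client library.
public static class RedisIndexSketch
{
    // Builds the arguments of an FT.CREATE command for a hash-based index
    // with one TAG field per configured tag and a single vector field.
    public static List<string> BuildCreateIndexArgs(
        string index, int vectorSize, IEnumerable<string> tagFields)
    {
        var args = new List<string>
        {
            index, "ON", "HASH",
            "PREFIX", "1", index + ":", // assumed key prefix: "<index>:"
            "SCHEMA"
        };

        // Tags can only be filtered on if they are declared here.
        foreach (var tag in tagFields)
        {
            args.Add(tag);
            args.Add("TAG");
        }

        // Vector field: "6" is the number of attribute tokens that follow.
        args.AddRange(new[]
        {
            "embedding", "VECTOR", "HNSW", "6",
            "TYPE", "FLOAT32",
            "DIM", vectorSize.ToString(CultureInfo.InvariantCulture),
            "DISTANCE_METRIC", "COSINE"
        });

        return args;
    }
}
```

Calling `RedisIndexSketch.BuildCreateIndexArgs("km", 1536, new[] { "user" })` yields the tokens to send after `FT.CREATE`. A tag that is missing from the list is still stored with the record but cannot be used in a search filter, which is why the list has to come from somewhere other than `CreateIndexAsync`.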

Devis Lucato · Answer 3 · Thu Dec 14 2023 04:35:16 GMT+0800 (China Standard Time)

> My initial thought (short of adding an extra API) would be to define the tags you want to index in the Configuration of the service?

Seems reasonable. So the Redis class will receive a configuration object containing the list of tags, and the user is responsible for setting up the right configuration. There are some known tags that we'll need to support by default, see the "reserved" tags (e.g. check Constants.cs)
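A rough sketch of such a configuration object might look like the following. The class and property names are hypothetical, and the reserved tag names are assumptions for illustration only; the authoritative list is whatever Constants.cs defines.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical configuration class; names are illustrative only.
public class RedisConfigSketch
{
    // Assumed reserved tag names. Check Constants.cs for the real values.
    public static readonly string[] ReservedTags =
    {
        "__document_id", "__file_id", "__file_part"
    };

    // Assumed connection setting, included only to make the sketch realistic.
    public string ConnectionString { get; set; } = "localhost:6379";

    // User-defined tags that should be searchable.
    public List<string> Tags { get; set; } = new List<string>();

    // Reserved tags are always indexed; user tags are appended, without duplicates.
    public IReadOnlyList<string> GetIndexedTags() =>
        ReservedTags.Concat(Tags).Distinct().ToList();
}
```

With this shape, the user only lists their own tags, and the connector can index the reserved ones unconditionally when it creates the index.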

Steve Lorello · Answer 4 · Thu Dec 14 2023 04:44:20 GMT+0800 (China Standard Time)

Exactly, I mean it's a tiny bit hacky, but short of adding a new API (which would break the other clients) idk how else you'd get there.

Question about MemoryRecord.Payload - is it safe to assume that this can be safely serialized into a JSON string? I see it's being done elsewhere, but naturally, if you serialize anything (except primitives, really), it'll be difficult for the resulting MemoryRecord to preserve the same type information between storage and the requester.

Devis Lucato · Answer 5 · Thu Dec 14 2023 05:59:19 GMT+0800 (China Standard Time)

> Question about MemoryRecord.Payload - is it safe to assume that this can be safely serialized into a JSON string?

I'd say so, yes, and we don't search inside the payload. Depending on the engine, the content might not be readable, but I wouldn't worry.
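The type-information concern can be seen in a small round trip with System.Text.Json. This is a sketch that assumes the payload is a `Dictionary<string, object>`; the helper class is hypothetical and is not how any particular connector is implemented.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper showing a JSON round trip of a payload dictionary.
public static class PayloadRoundTripSketch
{
    public static string Serialize(Dictionary<string, object> payload) =>
        JsonSerializer.Serialize(payload);

    // Values typed as object come back as JsonElement, not as their
    // original CLR types, so callers have to convert them explicitly.
    public static Dictionary<string, object> Deserialize(string json) =>
        JsonSerializer.Deserialize<Dictionary<string, object>>(json)
        ?? new Dictionary<string, object>();
}
```

For example, a payload entry stored as the `int` value 3 is returned as a `JsonElement`, and the caller has to call `GetInt32()` on it to get the number back. That is harmless as long as nothing searches inside the payload, but it does mean the stored and retrieved records are not type-identical.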

Devis Lucato · Answer 6 · Mon Dec 25 2023 03:51:12 GMT+0800 (China Standard Time)

@slorello89 FYI, I just finished integrating Postgres, and I think external repos might make the integration too expensive to maintain, because the Abstractions DLL (the only dependency used) changes often, and all these external repos would have to keep updating even for patches, to avoid compilation warnings. One option would be to version the Abstractions library separately, but we're not ready to do that yet.

I was thinking we could test this approach, see if it helps collaborations:

create in this repo a folder dedicated to storage adapters, e.g. Qdrant, Postgres, Redis, etc.
each storage adapter has its own maintainer, responsible for development/fixes/major upgrades, sending PRs that we merge - the PR must change only code in this folder
as KM maintainers, we take care of minor upgrades across the entire repo, e.g. upgrading the Abstractions nuget version, maybe also fixing simple build warnings/code style reported by R#.

Weihan Li · Answer 7 · Mon Dec 25 2023 14:08:30 GMT+0800 (China Standard Time)

@dluc wondering if it's possible to use the semantic-kernel memory connectors, so that we could avoid creating another series of memory connectors for kernel memory and make the most of semantic-kernel

Devis Lucato (#2) · Answer 8 · Mon Dec 25 2023 17:18:52 GMT+0800 (China Standard Time)

@WeihanLi the SK connectors are limited and missing important features such as filtering, hybrid search, custom schemas, plus they are very expensive to maintain because they need to be written in 3 languages (C#, Python and Java). The connectors under development here will eventually replace those (if one wants), reducing the cost because KM comes as a service and offers a plugin, bringing the extra features of RAG and memory management.

Steve Lorello · Answer 9 · Thu Dec 28 2023 21:37:00 GMT+0800 (China Standard Time)

Hi @dluc - I generally agree with you that it wouldn't make sense to kick such an integration out of this repository. I get the sense that things in SK/KM land are quite dynamic at the moment, and colocating the connectors in this repository and tying them into the same CI should prevent connectors from breaking whenever something changes in Abstractions. The PR I opened initially has the connector written into "core" (a bit odd, but that seemed to be the trend with the other datasources). I'll kick it out to a separate project.

Steve Lorello · Answer 10 · Thu Dec 28 2023 23:17:26 GMT+0800 (China Standard Time)

@dluc - moved it out of core and into a separate project under adapters/Redis/ given your comment above - not sure exactly what you want the directory structure to be.

Devis Lucato · Answer 11 · Fri Dec 29 2023 12:00:55 GMT+0800 (China Standard Time)

Repo updated: there's an extensions folder that should be easy to follow to add new DBs and other things. I also added an empty Redis placeholder if you'd like to work on it.