hoytech / strfry

a nostr relay

Probabilistic filters to reduce memory footprint of ReqWorker for monitoring

Giszmo opened this issue

Clients with big filters can demand a lot of RAM on the relay. This can be reduced significantly by using probabilistic filters. While these filters are not well suited for querying a database, they are great for checking individual events against.

I had implemented this idea using Cuckoo filters here. The REQ is received as a normal JSON filter object, which gets converted into a probabilistic filter after EOSE if it is, for example, filtering for more than 10 pubkeys. The probabilistic filter is set to a false-positive rate of 1/10,000,000, and the client filters out those false positives anyway.
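Roughly, the integration looks like this. The sketch below is simplified and not the actual implementation; it uses a Bloom filter instead of a Cuckoo filter purely to keep the example short, and the after-EOSE hook at the bottom is hypothetical.

```cpp
// Simplified sketch (not the actual implementation): swap a large
// pubkey list for a probabilistic filter after EOSE. A Bloom filter
// stands in for the Cuckoo filter here to keep the example short.
#include <algorithm>
#include <cmath>
#include <functional>
#include <string>
#include <vector>

struct BloomFilter {
    std::vector<bool> bits;
    size_t numHashes = 1;

    // Size the bit array for n items at false-positive rate p:
    // m = -n*ln(p)/ln(2)^2, k = (m/n)*ln(2)
    BloomFilter(size_t n, double p) {
        size_t m = (size_t)std::ceil(-double(n) * std::log(p) / (std::log(2.0) * std::log(2.0)));
        bits.assign(std::max<size_t>(m, 1), false);
        numHashes = std::max<size_t>(1, (size_t)std::round(double(bits.size()) / double(n) * std::log(2.0)));
    }

    // Derive k indexes via double hashing from two base hashes
    size_t index(const std::string &key, size_t i) const {
        size_t h1 = std::hash<std::string>{}(key);
        size_t h2 = std::hash<std::string>{}(key + "\x01");
        return (h1 + i * h2) % bits.size();
    }

    void add(const std::string &key) {
        for (size_t i = 0; i < numHashes; i++) bits[index(key, i)] = true;
    }

    // Never a false negative; false positives at rate ~p, which the
    // client filters out anyway.
    bool possiblyContains(const std::string &key) const {
        for (size_t i = 0; i < numHashes; i++)
            if (!bits[index(key, i)]) return false;
        return true;
    }
};

// Hypothetical integration point after EOSE:
//   if (filter.authors.size() > 10) {
//       BloomFilter bf(filter.authors.size(), 1e-7);
//       for (auto &pk : filter.authors) bf.add(pk);
//       filter.authors.clear();  // drop the full list, keep only bf
//   }
//   // For each incoming event: match if bf.possiblyContains(event.pubkey)
```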

It's a cool idea! I haven't yet heard anyone complaining about memory consumption from large filters, but it could be that nobody is looking at that, and/or it isn't a problem yet but could become one in the future (as contact lists get bigger, etc.).

I'd also like to see an analysis of the various overheads at different false-positive ratios. I did some back-of-the-envelope estimates using Bloom filters for a similar use case, but the numbers didn't seem that promising to me at the time.
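For reference, the standard Bloom filter sizing gives bits-per-item = -ln(p) / (ln 2)^2 ≈ 1.44 * log2(1/p). At p = 1/10,000,000 that is about 33.5 bits ≈ 4.2 bytes per pubkey, versus 32 bytes for a raw pubkey (plus whatever overhead the surrounding hash set adds), so very roughly a 7-8x reduction; at p = 1/1,000 it drops to about 14.4 bits ≈ 1.8 bytes per item. Whether that is worth the extra complexity depends on how large these filters actually get in practice.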

I think it would be great to put this through the NIP process too, since I expect a lot of people will have interesting insights on this!

I thought about turning it into a NIP, but in the context of a network improvement: send a probabilistic filter instead of the full list. That turned out to be really not worth it, as such optimisations are not appreciated in nostr at this point.

I consider the FP rate acceptable and within spec, so in my view this is a valid relay-side optimization.

As NIPs are about standardizing for the sake of interoperability, I don't see the point of a nip-probabilistic-filtering-after-eose. There is really no need for the client to know whether you use Bloom or Cuckoo filters, or how exactly it would look serialized.