google / zoekt

Fast trigram based code search

Reduce memory usage of zoekt-webserver

keegancsmith opened this issue · comments

I want to investigate reducing the memory usage of zoekt-webserver when there are many shards. I haven't investigated this properly yet, but have noticed OOMs and I am wondering if there are some wins here.

Ideas before profiling:

  • More bounded concurrency serving a request.
  • Profile per request memory usage.
  • Use an fs-based data structure for map[ngram].
  • Use simpleSection for fileName ngrams map value.
  • Use fs format for more fields in indexData.

I suspect the working set of ngrams over a time period is relatively small compared to the number of unique ngrams in a shard. So the performance implications of an fs-based data structure for map[ngram] may be minimal.

@hanwen if you have any ideas to focus my investigation that would be great.

So the performance implications of an fs-based data structure for map[ngram] may be minimal.

This could be especially true with NVMe storage becoming more widely available and so fast.

There are also options for compressed posting lists. For example, https://github.com/dgryski/go-postings

@dgryski thanks for the link, I remember reading that code a while ago! (Thanks for all the interesting stuff you post btw, I'm a reader). However, the posting lists are already disk backed. It is a map[ngram]simpleSection. A simpleSection is just an offset and length into the index file.

ngrams map[ngram]simpleSection
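
For context, here is roughly what those types look like (a sketch; names may differ slightly from the actual zoekt source):

    // ngram packs the runes of a trigram into a single integer key.
    type ngram uint64

    // simpleSection locates a posting list inside the mmap'd index
    // file; the posting bytes themselves stay on disk until faulted in.
    type simpleSection struct {
        off uint32 // byte offset into the index file
        sz  uint32 // length in bytes
    }

So each map entry costs 8 bytes of key plus 8 bytes of value before Go's map overhead; only the posting data lives behind the mmap.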

I'd do profiling before throwing out ideas, and classify the problems more precisely. Do you get OOM for queries with many results? For literal substring search or regex matching? Are the queries complex over few shards or simple over many shards? The content and posting lists are mmap'ed, so they are effectively already FS-based. Is the OOM because it faults in too many mmap'd pages, or because the search generates too much intermediate data?

re. FS based data structures, what does that mean? FS entities (dirnames, inodes) are very large compared to the data (lots of int32s) in the index, so moving them to FS will almost certainly increase overall memory use.

I'd do profiling before throwing out ideas, and classify the problems more precisely.

Fair enough! I thought I'd do this just in case it triggered some ideas you had already thought about.

Do you get OOM for queries with many results? For literal substring search or regex matching? Are the queries complex over few shards or simple over many shards?

I don't know! It would be over many shards, though. The instance had 20k+ repos indexed on a machine with 80 GB of RAM. As I said, I haven't actually dived into this yet; I just wanted to pre-empt discussion.

re. FS based data structures, what does that mean? FS entities (dirnames, inodes) are very large compared to the data (lots of int32s) in the index, so moving them to FS will almost certainly increase overall memory use.

Sorry, "FS based" is misleading. I meant that instead of unmarshalling the whole map[ngram]simpleSection into memory, you have some data structure that operates directly over IndexFile. Traditionally I'd expect this to be some sort of btree.
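
As a minimal sketch of the idea, assuming a hypothetical on-disk layout of fixed-size records (8-byte ngram, 4-byte offset, 4-byte size) sorted by ngram and reachable through the mmap'd file, lookup becomes a binary search instead of a map access (reusing the types sketched above):

    import "encoding/binary"

    // lookupNgram binary searches a sorted table of 16-byte records
    // held in an mmap'd byte slice, so no per-shard map is built.
    // The record layout here is hypothetical, not zoekt's format.
    func lookupNgram(tab []byte, want ngram) (simpleSection, bool) {
        const rec = 16
        lo, hi := 0, len(tab)/rec
        for lo < hi {
            mid := (lo + hi) / 2
            p := tab[mid*rec:]
            g := ngram(binary.BigEndian.Uint64(p))
            switch {
            case g == want:
                return simpleSection{
                    off: binary.BigEndian.Uint32(p[8:]),
                    sz:  binary.BigEndian.Uint32(p[12:]),
                }, true
            case g < want:
                lo = mid + 1
            default:
                hi = mid
            }
        }
        return simpleSection{}, false
    }

The cost is O(log n) page faults per lookup instead of O(1) map probes, which is why the working-set argument above matters.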

Apologies if I prematurely filed this issue. Was just showing intent to work on this.

remember that for each shard, the map[ngram] might be small, so a btree might well create unnecessary overhead.

One random idea: you could use uint32 instead of uint64 for shards that are pure ASCII; or maybe have a map[uint32] and a map[uint64] for unicode vs ascii trigrams.

Is this with your RPC-powered server? If so, it will have to copy around the content for the matches. If you can avoid (de)serializing data you'd avoid a lot of garbage.

Try to see if you can repro the problem with the plain zoekt webserver or if it is a sourcegraph specific problem.

Apologies

No worries. I'm curious to see what you come up with.

FYI I did some memory profiling on an instance with 38k shards. I took a heap profile after startup, before any search queries had run. It was using about 40 GB, and the biggest offender by far was the trigram map.
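
(For anyone reproducing this: assuming the server registers the net/http/pprof handlers, a heap profile can be grabbed the standard way.)

    import (
        "log"
        "net/http"
        _ "net/http/pprof" // registers /debug/pprof/* on DefaultServeMux
    )

    func main() {
        // Then: go tool pprof -top http://localhost:6060/debug/pprof/heap
        log.Fatal(http.ListenAndServe("localhost:6060", nil))
    }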

remember that for each shard, the map[ngram] might be small, so a btree might well create unnecessary overhead.

Yeah, my suspicion is that the shards in the long tail are small. So indeed overall performance may degrade quite a bit if we naively switch to btrees.

One random idea: you could use uint32 instead of uint64 for shards that are pure ASCII; or maybe have a map[uint32] and a map[uint64] for unicode vs ascii trigrams.

This is a neat idea. I guess that is roughly a 1/4 memory saving: the map values are two 32-bit uints (8 bytes), so shrinking the key from 8 bytes to 4 takes each entry from 16 bytes to 12, before map overhead. We could even do this without changing the index format, which makes it quite attractive.
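
A rough sketch of the split, assuming zoekt's packing of three 21-bit runes per uint64 ngram (so a pure-ASCII trigram repacks into 21 bits); the helper below is illustrative, not existing code:

    type asciiNgram uint32

    // splitNgrams moves trigrams whose runes are all ASCII into a map
    // with 4-byte keys, halving the key size for the common case; the
    // rest keep their 8-byte keys.
    func splitNgrams(m map[ngram]simpleSection) (map[asciiNgram]simpleSection, map[ngram]simpleSection) {
        ascii := make(map[asciiNgram]simpleSection, len(m))
        uni := make(map[ngram]simpleSection)
        for g, sec := range m {
            r1 := (g >> 42) & 0x1FFFFF
            r2 := (g >> 21) & 0x1FFFFF
            r3 := g & 0x1FFFFF
            if r1 < 128 && r2 < 128 && r3 < 128 {
                ascii[asciiNgram(r1<<14|r2<<7|r3)] = sec
            } else {
                uni[g] = sec
            }
        }
        return ascii, uni
    }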

Is this with your RPC-powered server? If so, it will have to copy around the content for the matches. If you can avoid (de)serializing data you'd avoid a lot of garbage.

Indeed this is. We did notice a bug recently where we set max matches far too high, which would have led to high per-request memory usage. However, the RPC layer is actually quite efficient. gob is nice :) So, outside of us being silly with the search options, I would be surprised if this was Sourcegraph specific.

There are also options for compressed posting lists. For example, https://github.com/dgryski/go-postings

@dgryski thanks again for this. The approach zoekt uses is already varint encoding on deltas (just not group varint). However, I've been experimenting with dividing the posting lists into blocks (like go-postings) to speed up hitIterator.next(limit uint32) (called advance in your code).
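
Roughly what I've been playing with, as a sketch (the block layout is my experiment, not zoekt's on-disk format):

    import (
        "encoding/binary"
        "sort"
    )

    // blockedPostings splits a delta+varint posting list into blocks
    // and records each block's first docID, so advance can skip whole
    // blocks instead of decoding every delta up to the target.
    type blockedPostings struct {
        firstIDs []uint32 // first posting in each block
        blocks   [][]byte // uvarint-encoded deltas for the rest
    }

    // advance returns the smallest posting >= limit.
    func (p *blockedPostings) advance(limit uint32) (uint32, bool) {
        // Binary search for the last block starting at or below limit.
        b := sort.Search(len(p.firstIDs), func(i int) bool {
            return p.firstIDs[i] > limit
        }) - 1
        if b < 0 {
            b = 0
        }
        for ; b < len(p.blocks); b++ {
            v, buf := p.firstIDs[b], p.blocks[b]
            for {
                if v >= limit {
                    return v, true
                }
                d, n := binary.Uvarint(buf)
                if n <= 0 {
                    break // block exhausted, try the next one
                }
                v += uint32(d)
                buf = buf[n:]
            }
        }
        return 0, false
    }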

A tunable we can look into is setting GOGC=50. The on-heap data set is relatively stable, isn't it? With such a large heap, the default GOGC=100, which lets the heap grow 100% over the live set before a collection, seems too much. Halving that value means the GC kicks in once the heap grows 50% over that baseline, so we'd need far less RAM.
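
For reference, this can be set per process via the GOGC environment variable, or from code with the runtime/debug API:

    import "runtime/debug"

    func init() {
        // Collect once the heap grows 50% over the live set, instead
        // of the default 100%; trades some CPU for a smaller peak RSS.
        debug.SetGCPercent(50)
    }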