Perf issue during fanout
jyotimahapatra opened this issue
We use a map to store the requests in the cache (https://github.com/envoyproxy/xds-relay/blob/master/internal/app/cache/cache.go#L52), keyed on the DiscoveryRequest. As a result, every entry in the map is unique, and the addition and deletion of unique entries causes a memory overload on the map. This is a known issue with Go maps. (here, here)
To test the hypothesis, I replicated the benchmark tests to insert an increasing number of DiscoveryRequests and then remove them. This simulates the fanout scenario (here). Even though the cache ends up holding a single entry, adding and deleting a growing number of map entries sharply increases processing time: from about 1.5 µs/op at 1 request to about 4.8 ms/op at 1000000 requests.
Benchmarking code: #196
➜ xds-relay git:(master) ✗ export MAX_DISCOVERY_REQUESTS=1 && go test -benchmem -run=^$ github.com/envoyproxy/xds-relay/internal/app/cache -bench "^(BenchmarkCacheRetrieval)$" | grep ns/op
BenchmarkCacheRetrieval-8 721880 1509 ns/op 944 B/op 12 allocs/op
➜ xds-relay git:(master) ✗ export MAX_DISCOVERY_REQUESTS=10 && go test -benchmem -run=^$ github.com/envoyproxy/xds-relay/internal/app/cache -bench "^(BenchmarkCacheRetrieval)$" | grep ns/op
BenchmarkCacheRetrieval-8 809854 1473 ns/op 944 B/op 12 allocs/op
➜ xds-relay git:(master) ✗ export MAX_DISCOVERY_REQUESTS=100 && go test -benchmem -run=^$ github.com/envoyproxy/xds-relay/internal/app/cache -bench "^(BenchmarkCacheRetrieval)$" | grep ns/op
BenchmarkCacheRetrieval-8 658707 1641 ns/op 944 B/op 12 allocs/op
➜ xds-relay git:(master) ✗ export MAX_DISCOVERY_REQUESTS=1000 && go test -benchmem -run=^$ github.com/envoyproxy/xds-relay/internal/app/cache -bench "^(BenchmarkCacheRetrieval)$" | grep ns/op
BenchmarkCacheRetrieval-8 264152 4144 ns/op 944 B/op 12 allocs/op
➜ xds-relay git:(master) ✗ export MAX_DISCOVERY_REQUESTS=10000 && go test -benchmem -run=^$ github.com/envoyproxy/xds-relay/internal/app/cache -bench "^(BenchmarkCacheRetrieval)$" | grep ns/op
BenchmarkCacheRetrieval-8 50784 24675 ns/op 944 B/op 12 allocs/op
➜ xds-relay git:(master) ✗ export MAX_DISCOVERY_REQUESTS=100000 && go test -benchmem -run=^$ github.com/envoyproxy/xds-relay/internal/app/cache -bench "^(BenchmarkCacheRetrieval)$" | grep ns/op
BenchmarkCacheRetrieval-8 5220 222593 ns/op 944 B/op 12 allocs/op
➜ xds-relay git:(master) ✗ export MAX_DISCOVERY_REQUESTS=1000000 && go test -benchmem -run=^$ github.com/envoyproxy/xds-relay/internal/app/cache -bench "^(BenchmarkCacheRetrieval)$" | grep ns/op
BenchmarkCacheRetrieval-8 255 4825196 ns/op 944 B/op 12 allocs/op
A separate benchmark test (#198), run from the orchestrator's perspective, gave similar results.
➜ xds-relay git:(master) ✗ export MAX_DISCOVERY_REQUESTS=1 && go test -benchmem -run=^$ github.com/envoyproxy/xds-relay/internal/app/orchestrator -bench "^(BenchmarkGoldenPath)$" | grep ns
BenchmarkGoldenPath-8 69771 16503 ns/op 9408 B/op 93 allocs/op
➜ xds-relay git:(master) ✗ export MAX_DISCOVERY_REQUESTS=10 && go test -benchmem -run=^$ github.com/envoyproxy/xds-relay/internal/app/orchestrator -bench "^(BenchmarkGoldenPath)$" | grep ns
BenchmarkGoldenPath-8 64796 16518 ns/op 9408 B/op 93 allocs/op
➜ xds-relay git:(master) ✗ export MAX_DISCOVERY_REQUESTS=100 && go test -benchmem -run=^$ github.com/envoyproxy/xds-relay/internal/app/orchestrator -bench "^(BenchmarkGoldenPath)$" | grep ns
BenchmarkGoldenPath-8 68280 18062 ns/op 9408 B/op 93 allocs/op
➜ xds-relay git:(master) ✗ export MAX_DISCOVERY_REQUESTS=1000 && go test -benchmem -run=^$ github.com/envoyproxy/xds-relay/internal/app/orchestrator -bench "^(BenchmarkGoldenPath)$" | grep ns
BenchmarkGoldenPath-8 50516 23984 ns/op 9408 B/op 93 allocs/op
➜ xds-relay git:(master) ✗ export MAX_DISCOVERY_REQUESTS=10000 && go test -benchmem -run=^$ github.com/envoyproxy/xds-relay/internal/app/orchestrator -bench "^(BenchmarkGoldenPath)$" | grep ns
BenchmarkGoldenPath-8 28072 41137 ns/op 9408 B/op 93 allocs/op
➜ xds-relay git:(master) ✗ export MAX_DISCOVERY_REQUESTS=100000 && go test -benchmem -run=^$ github.com/envoyproxy/xds-relay/internal/app/orchestrator -bench "^(BenchmarkGoldenPath)$" | grep ns
BenchmarkGoldenPath-8 4819 236752 ns/op 9426 B/op 93 allocs/op
I updated gcp and removed the use of maps from the cache and downstream (#204).
The benchmark now looks like this:
➜ xds-relay git:(benchnomap) export MAX_DISCOVERY_REQUESTS=1 && go test -run=^$ github.com/envoyproxy/xds-relay/internal/app/orchestrator -bench "^(BenchmarkGoldenPath)$" | grep ns
99036 11910 ns/op
➜ xds-relay git:(benchnomap) ✗ export MAX_DISCOVERY_REQUESTS=10 && go test -run=^$ github.com/envoyproxy/xds-relay/internal/app/orchestrator -bench "^(BenchmarkGoldenPath)$" | grep ns
92912 12383 ns/op
➜ xds-relay git:(benchnomap) ✗ export MAX_DISCOVERY_REQUESTS=100 && go test -run=^$ github.com/envoyproxy/xds-relay/internal/app/orchestrator -bench "^(BenchmarkGoldenPath)$" | grep ns
94479 12682 ns/op
➜ xds-relay git:(benchnomap) ✗ export MAX_DISCOVERY_REQUESTS=1000 && go test -run=^$ github.com/envoyproxy/xds-relay/internal/app/orchestrator -bench "^(BenchmarkGoldenPath)$" | grep ns
106798 12866 ns/op
➜ xds-relay git:(benchnomap) ✗ export MAX_DISCOVERY_REQUESTS=10000 && go test -run=^$ github.com/envoyproxy/xds-relay/internal/app/orchestrator -bench "^(BenchmarkGoldenPath)$" | grep ns
94300 12878 ns/op
➜ xds-relay git:(benchnomap) ✗ export MAX_DISCOVERY_REQUESTS=100000 && go test -run=^$ github.com/envoyproxy/xds-relay/internal/app/orchestrator -bench "^(BenchmarkGoldenPath)$" | grep ns
12646 86420 ns/op
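This is an improvement over the current implementation: up to 10000 requests the golden path stays roughly flat at about 12000 to 13000 ns/op, and at 100000 requests it takes 86420 ns/op instead of 236752 ns/op.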