paritytech / substrate-api-sidecar

REST service that makes it easy to interact with blockchain nodes built using Substrate's FRAME framework.

Home Page: https://paritytech.github.io/substrate-api-sidecar/dist/


[Meta] Performance

TarikGul opened this issue · comments

Summary

Recently we have been posed with the question of how we can make Sidecar more performant. Improvements here have multiple positive side effects: lower latency, lower server costs, a more responsive API, etc. As shown below, there are multiple avenues, some internal and some external, all of which can contribute to Sidecar's performance.

NOTE: everything I cover below is in reference to the /blocks/{blockId} endpoint. The majority of the other endpoints are already very performant and don't have the same issues as the blocks endpoint does.

My Environment

MacBook Pro 16-inch, M2, 32 GB memory.

I run Sidecar with --inspect in order to profile and inspect performance.
I am also running an archive node locally.

External

Subway

Currently in beta, this library acts as an RPC middleware proxy that can cache RPC calls to the node. I ran load tests against Sidecar for 60s with and without Subway to demonstrate the effects. An important caveat: I am running an archive node locally, which doesn't show the full power of Subway; when it has to query an external archive node, the cache really shows its benefits.

Without:

539 requests in 1.00m, 569.18MB read
Requests/sec:      8.97
Transfer/sec:      9.47MB
--------------------------
Total completed requests:       	539
Failed requests:                	0
Timeouts:                       	0
Avg RequestTime(Latency):          475.98ms
Max RequestTime(Latency):          2722.973ms
Min RequestTime(Latency):          0.622ms

With:

619 requests in 1.00m, 631.96MB read
Requests/sec:     10.30
Transfer/sec:     10.52MB
--------------------------
Total completed requests:       	619
Failed requests:                	0
Timeouts:                       	0
Avg RequestTime(Latency):          449.25ms
Max RequestTime(Latency):          3353.909ms
Min RequestTime(Latency):          0.406ms

Overall we can see a substantial benefit from the RPC cache layer, which increases Sidecar's throughput.

Local RPC vs non-local RPC node

  • External RPC node: /blocks/7753833 1477ms <- hosted in Germany
  • Local RPC node: /blocks/7753833 479ms <- hosted on my local machine

This is pretty obvious, so I didn't go too deep into it, but keeping your server close to your RPC node helps lower latency.

Internal

Let's talk about RPC requests :). Currently, a lot of the overhead comes from non-batched RPC calls (calls that aren't wrapped in Promise.all). When you profile requests with --inspect, you can see the server sitting idle, waiting for responses before it can continue its operations.
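
To illustrate the idle-time cost, here is a minimal sketch (not Sidecar code) where `fakeRpc` stands in for a real RPC round-trip. Awaiting each call in sequence accumulates the full latency of every call, while `Promise.all` overlaps them so total time approaches that of the slowest single call:

```typescript
// `fakeRpc` simulates an RPC round-trip that resolves after a short delay.
const fakeRpc = (method: string, delayMs = 20): Promise<string> =>
  new Promise((resolve) => setTimeout(() => resolve(`${method}:ok`), delayMs));

// Sequential awaits: total time ≈ sum of all delays; the server idles
// between every round-trip.
async function sequential(methods: string[]): Promise<string[]> {
  const out: string[] = [];
  for (const m of methods) {
    out.push(await fakeRpc(m));
  }
  return out;
}

// Batched: the requests are all in flight at once, so total time ≈ the
// slowest single call.
async function batched(methods: string[]): Promise<string[]> {
  return Promise.all(methods.map((m) => fakeRpc(m)));
}
```

Both functions return the same results in the same order; only the wall-clock time differs.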

No Fees

Calculating fees has a few fundamental problems.

  1. We await promises inside a loop.

  2. We can't batch those promises.

Therefore, when I added an option for /blocks/{blockId}?noFees=true, our performance went up dramatically:

With noFees:

1999 requests in 1.00m, 2.12GB read
Requests/sec:     33.26
Transfer/sec:     36.05MB
--------------------------
Total completed requests:       	1999
Failed requests:                	0
Timeouts:                       	0
Avg RequestTime(Latency):          146.46ms
Max RequestTime(Latency):          1540.528ms
Min RequestTime(Latency):          0.462ms

Without noFees:

447 requests in 1.00m, 521.99MB read
Requests/sec:      7.44
Transfer/sec:      8.69MB
--------------------------
Total completed requests:       	447
Failed requests:                	0
Timeouts:                       	0
Avg RequestTime(Latency):          576.22ms
Max RequestTime(Latency):          2895.802ms
Min RequestTime(Latency):          2.93ms

The fees section needs to be optimized with an algorithm that iterates through the extrinsics, builds a map of the promises needed to get the fee information, batches those promises in a single call, and then applies the results back to their corresponding extrinsics.
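
A rough sketch of that algorithm, with hypothetical types and a stand-in `fetchFeeInfo` in place of the real per-extrinsic fee query:

```typescript
// Illustrative shapes only; not Sidecar's actual types.
interface Extrinsic {
  index: number;
  method: string;
}
interface FeeInfo {
  partialFee: string;
}

// Stand-in for the real fee query; returns a fake fee for illustration.
const fetchFeeInfo = async (ext: Extrinsic): Promise<FeeInfo> => ({
  partialFee: `${ext.index * 100}`,
});

async function feesForBlock(
  extrinsics: Extrinsic[]
): Promise<Map<number, FeeInfo>> {
  // 1. Create the promises in the loop WITHOUT awaiting them, keyed by
  //    extrinsic index so results can be applied back later.
  const pending = extrinsics.map((ext) =>
    fetchFeeInfo(ext).then((fee) => [ext.index, fee] as const)
  );
  // 2. Batch them all in a single Promise.all.
  const resolved = await Promise.all(pending);
  // 3. Apply each result to its corresponding extrinsic via the index map.
  return new Map(resolved);
}
```

The key point is that the loop only constructs promises; the single `await Promise.all` is where all the network time is spent, overlapped rather than serialized.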

Getting Hash of a block number

Before each call to /blocks/{blockId}, if the passed-in blockId is a number, we fetch its corresponding blockHash. If a blockHash is passed in instead, there is a small performance improvement, but an improvement nonetheless. BUT, for fees we also need the previous blockHash, which means that if the user passes in a number, we can batch those 2 calls together, since we only need to subtract 1 from the blockId.
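
A minimal sketch of that batching, where `getBlockHash` stands in for the real chain RPC (the fake implementation here just formats the number so the sketch is runnable):

```typescript
// Stand-in for the real block-hash RPC; returns a fake hash string.
const getBlockHash = async (n: number): Promise<string> => `0xhash-${n}`;

// When blockId is a number, the hash of block N and of its parent N - 1
// can be fetched in one batch instead of two sequential round-trips.
async function hashesForFeeCalc(blockId: number): Promise<[string, string]> {
  const [hash, parentHash] = await Promise.all([
    getBlockHash(blockId),
    getBlockHash(blockId - 1),
  ]);
  return [hash, parentHash];
}
```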

Similarly, for /blocks/head we make an RPC call in the controller to retrieve the header. Can this also be batched somewhere?

Finalization

A small performance improvement can also be made via the finalizes field in the controller config. If set to false, this reduces the number of calls by 1, which is one less idle period for the server. We should add an override query param that sets finalizes to false, thereby saving a call.
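
The override could look something like the following sketch; the `finalized` query-param name is purely illustrative, not an existing Sidecar param:

```typescript
// Decide whether the finalized-head RPC call is needed for this request.
// `configFinalizes` mirrors the controller config's finalizes field;
// `queryParam` is a hypothetical ?finalized= override from the request.
function shouldQueryFinalizedHead(
  configFinalizes: boolean,
  queryParam?: string
): boolean {
  if (queryParam === 'false') {
    return false; // user opted out: one fewer RPC call, one fewer idle period
  }
  return configFinalizes;
}
```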

Overall

It's quite clear that awaiting promises individually has an impact on server and endpoint performance in Sidecar, most of all when calculating fees. I think these findings should be used to optimize how we query a block's information, but also as motivation to write docs on how to increase the speed of Sidecar's endpoints using the available query params and/or external tooling.

Edit: I will add more findings and measurements below as they become available.

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/scaling-down-of-parity-s-public-infrastructure/4697/6