paritytech / substrate-api-sidecar

REST service that makes it easy to interact with blockchain nodes built using Substrate's FRAME framework.

Home Page: https://paritytech.github.io/substrate-api-sidecar/dist/

API Sidecar requests are very slow

zhy827827 opened this issue

Since this morning, API requests have been very slow; previously they were very fast.

What could be causing this?

2023-12-21 05:40:08 http: GET /blocks/18689818 200 26229ms
2023-12-21 05:40:10 http: GET /blocks/18687006 200 37203ms
2023-12-21 05:40:11 http: GET /blocks/18687150 200 41863ms
2023-12-21 05:40:11 http: GET /blocks/18687153 200 42968ms
2023-12-21 05:40:15 http: GET /blocks/18687004 200 42359ms
2023-12-21 05:40:15 http: GET /blocks/18687003 200 42816ms
2023-12-21 05:40:17 http: GET /blocks/18687005 200 43683ms
2023-12-21 05:40:19 http: GET /blocks/18687007 200 45851ms
2023-12-21 05:40:19 http: GET /blocks/head/header 200 141ms
2023-12-21 05:40:23 http: GET /blocks/head/header 200 56ms
2023-12-21 05:40:23 http: GET /blocks/head/header 200 56ms
2023-12-21 05:40:29 http: GET /blocks/head/header 200 67ms
2023-12-21 05:40:36 http: GET /blocks/18687005 200 378ms
2023-12-21 05:40:37 http: GET /blocks/18687007 200 529ms
2023-12-21 05:40:38 http: GET /blocks/18687007 200 261ms
2023-12-21 05:40:38 http: GET /blocks/18687005 200 289ms
2023-12-21 05:40:38 http: GET /blocks/head 200 42737ms
2023-12-21 05:40:38 http: GET /blocks/head/header 200 87ms
2023-12-21 05:40:38 http: GET /blocks/head/header 200 84ms
2023-12-21 05:40:42 http: GET /blocks/18689819 200 31477ms
2023-12-21 05:40:42 http: GET /blocks/18687154 200 42514ms
2023-12-21 05:40:42 http: GET /blocks/head/header 200 116ms
2023-12-21 05:40:42 http: GET /blocks/head/header 200 117ms
2023-12-21 05:40:44 http: GET /blocks/head 200 43090ms
2023-12-21 05:40:45 http: GET /blocks/18687152 200 45707ms
2023-12-21 05:40:46 http: GET /blocks/18687151 200 45929ms
2023-12-21 05:40:48 http: GET /blocks/18687006 200 42921ms
2023-12-21 05:40:50 http: GET /blocks/head 200 43634ms
2023-12-21 05:40:50 http: GET /blocks/head 200 43528ms
2023-12-21 05:40:51 http: GET /blocks/18687150 200 51447ms
2023-12-21 05:40:52 http: GET /blocks/18687153 200 51995ms
2023-12-21 05:40:53 http: GET /blocks/18687004 200 48363ms
2023-12-21 05:40:54 http: GET /blocks/18687003 200 48796ms
2023-12-21 05:40:54 http: GET /blocks/head 200 50336ms
2023-12-21 05:40:54 http: GET /blocks/head 200 49980ms
2023-12-21 05:40:55 http: GET /blocks/head 200 50097ms
2023-12-21 05:40:55 http: GET /blocks/head 200 49868ms
2023-12-21 05:40:55 http: GET /blocks/18687005 200 50568ms
2023-12-21 05:40:57 http: GET /blocks/18687007 200 52017ms

polkadot version: v1.5.0
sidecar version: v17.3.2

The server has a 12-core CPU, 64 GB of memory, and a 3 TB SSD.

Over 1 million transactions today, caused by DotOrdinals inscriptions.

Which Polkadot API are you talking about?

This is most likely a bug in Sidecar: it is too slow to handle all the extrinsics.

Thanks for reporting. Indeed, this seems to be an issue within API Sidecar; we are looking into it.

We are facing the same issue. Right after a start the response time is already huge... but after sequential requests it becomes much, much worse.

GET /blocks/18685671 200 41512ms
GET /blocks/18685673 200 41645ms
GET /blocks/18685637 200 132598ms
GET /blocks/18685670 200 43180ms
GET /blocks/18685660 200 76406ms
GET /blocks/18685644 200 78103ms
GET /blocks/18685654 200 51118ms
GET /blocks/18685637 200 112040ms

We tried significantly increasing our CPU/memory as well as --max-old-space-size, but it doesn't seem to improve things much.

This was addressed by our latest release.
This release focuses on improving the tool's performance, resolving a regression where blocks were overwhelmed with transactions. The noFees query parameter removes fee info from the blocks response when the user does not need fees. For the more general case where fees are necessary, we have increased the performance of querying /blocks while still calculating fees. This was done in two ways: ensuring the transactionPaidFee and ExtrinsicSuccess or ExtrinsicFailure info is used to its fullest so we can avoid making any additional RPC calls, and ensuring the extrinsics are processed concurrently.
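
Conceptually, the second part of that change means fee information is read from data the block already carries (the paid-fee amount recorded alongside the ExtrinsicSuccess / ExtrinsicFailure events) instead of issuing an extra RPC call per extrinsic, and the per-extrinsic work runs concurrently. A rough sketch of the idea, using hypothetical types and a hypothetical queryFeeViaRpc fallback rather than Sidecar's actual internals:

```typescript
// Illustrative only: hypothetical shapes, not Sidecar's actual types.
interface ExtrinsicEvent {
  method: string; // e.g. "TransactionFeePaid", "ExtrinsicSuccess", "ExtrinsicFailed"
  data: Record<string, string>;
}

interface Extrinsic {
  hash: string;
  events: ExtrinsicEvent[];
}

// Hypothetical fallback that would cost one extra RPC round trip per call.
declare function queryFeeViaRpc(extrinsicHash: string): Promise<string>;

// Prefer fee info already present in the block's events; only fall back to an
// RPC call when it is missing. All extrinsics are processed concurrently.
async function collectFees(extrinsics: Extrinsic[]): Promise<string[]> {
  return Promise.all(
    extrinsics.map(async (ext) => {
      const feePaid = ext.events.find((e) => e.method === 'TransactionFeePaid');
      if (feePaid?.data.actualFee !== undefined) {
        return feePaid.data.actualFee; // no extra round trip needed
      }
      return queryFeeViaRpc(ext.hash); // fallback: one RPC call for this extrinsic
    }),
  );
}
```

With hundreds of extrinsics per block, avoiding a sequential RPC call per extrinsic is where most of the time is saved.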

What were the performance test outcomes?

We are using the new release and the noFees param, and now see 2-second response times (a major improvement over the 25-30 second responses), but that is still substantially slower than the sub-1-second responses of the past.
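
For reference, reproducing these timings against a running sidecar can look roughly like the sketch below (TypeScript on Node 18+ for the built-in fetch; the base URL and block height are placeholders, not values taken from this thread):

```typescript
// Time a single /blocks request that uses the noFees query parameter.
const SIDECAR_URL = 'http://127.0.0.1:8080'; // placeholder: adjust to your deployment

async function timeBlockRequest(height: number): Promise<void> {
  const start = Date.now();
  const res = await fetch(`${SIDECAR_URL}/blocks/${height}?noFees=true`);
  if (!res.ok) {
    throw new Error(`GET /blocks/${height} failed with HTTP ${res.status}`);
  }
  await res.json(); // read the full body before stopping the timer
  console.log(`GET /blocks/${height} ${res.status} ${Date.now() - start}ms`);
}

timeBlockRequest(18687005).catch(console.error);
```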

@exp0nge

We are using the new release and the noFees param, and now see 2-second response times (a major improvement over the 25-30 second responses), but that is still substantially slower than the sub-1-second responses of the past.

What version were you using before you updated to 17.3.3? From 17.3.2 -> 17.3.3 performance has been the only change we made.

If I had to guess, the reason you are seeing an increase in response time is that the average block size in terms of extrinsics has gone up dramatically. Just a day and a half ago the average extrinsic count per block was probably in the single digits to low tens, whereas now it is consistently averaging in the hundreds.

But in terms of Sidecar itself, if you test against older blocks you will see an increase in performance.

@TarikGul

What version were you using before you updated to 17.3.3? From 17.3.2 -> 17.3.3 performance has been the only change we made.

We went from 17.3.2 -> 17.3.3 at the start of the day for this, so we were purely in it for the performance gain.

If I had to guess, the reason you are seeing an increase in response time is that the average block size in terms of extrinsics has gone up dramatically. Just a day and a half ago the average extrinsic count per block was probably in the single digits to low tens, whereas now it is consistently averaging in the hundreds.

Yeah, we noticed ordinal/inscription load on other networks too. However, the indirection of api-sidecar adds an extra layer of complication, since we're entirely reliant on it to translate data from the Polkadot node.

--

We run the api-sidecar within the same pod next to the polkadot node in AWS EKS. What used to be a tiny sidecar is now allocated 8 GB requested / 16 GB limit of memory, while the node has significantly less (4 GB / 8 GB). This is the only way I could think of to increase concurrent performance, given that the sidecar continues to respond quite slowly. This is OK when we're not behind chain tip, but incredibly bad if we do fall behind, as there's only so much we can squeeze out of each pod.

The node's performance doesn't seem to have been impacted at all, even though the sidecar puts so much demand on it. That makes me think there's even more performance to be had here. We have both noFees and finalizedKey set.

If we can get more performance out of it, we'll be in a healthier spot. I was originally looking at #1361 before the report here accelerated some of that. We're happy to provide any other insights that might help the team. I do realize that with the holidays this might be a challenge, though.
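
Separately from Sidecar's own internals, one consumer-side mitigation when catching up to chain tip is to cap how many /blocks requests are in flight at once instead of firing the whole backlog at a sidecar that is already slow. A rough sketch under the same assumptions as above (placeholder URL, arbitrary block range and batch size):

```typescript
const SIDECAR_URL = 'http://127.0.0.1:8080'; // placeholder: adjust to your deployment

async function fetchBlock(height: number): Promise<unknown> {
  const res = await fetch(`${SIDECAR_URL}/blocks/${height}?noFees=true`);
  if (!res.ok) throw new Error(`block ${height}: HTTP ${res.status}`);
  return res.json();
}

// Walk a block range in small batches so a busy sidecar is never flooded
// with the entire backlog at once.
async function fetchRange(from: number, to: number, concurrency = 4): Promise<unknown[]> {
  const heights = Array.from({ length: to - from + 1 }, (_, i) => from + i);
  const results: unknown[] = [];
  for (let i = 0; i < heights.length; i += concurrency) {
    const batch = heights.slice(i, i + concurrency);
    results.push(...(await Promise.all(batch.map(fetchBlock))));
  }
  return results;
}

fetchRange(18687003, 18687010).then((blocks) => console.log(`fetched ${blocks.length} blocks`));
```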