hyphacoop / api.distributed.press

Home Page: https://distributed.press

PUTing tarball takes a long time

fauno opened this issue

commented

Synchronizing a 44MB site takes ~5 minutes. I'm attaching logs.

CMS side:

{"@timestamp":"2023-02-08 20:10:53 +0000","@version":1,"content_length":"718","http_method":"GET","message":"[HTTParty] 200 \"GET /v1/sites/compost.testing.sutty.nl\" 718 ","path":"/v1/sites/compost.testing.sutty.nl","response_code":200,"severity":"info","tags":["HTTParty"]}
{"@timestamp":"2023-02-08 20:15:45 +0000","@version":1,"content_length":"0","http_method":"PUT","message":"[HTTParty] 200 \"PUT /v1/sites/compost.testing.sutty.nl\" 0 ","path":"/v1/sites/compost.testing.sutty.nl","response_code":200,"severity":"info","tags":["HTTParty"]}

DP side:

{"time":"15:10:58","reqId":"req-2p","req":{"method":"GET","url":"/v1/sites/compost.testing.sutty.nl"},"msg":"incoming request"}
{"time":"15:10:58","reqId":"req-2p","res":{"statusCode":200},"responseTime":3.6780920028686523,"msg":"request completed"}

{"time":"15:11:59","reqId":"req-2q","req":{"method":"PUT","url":"/v1/sites/compost.testing.sutty.nl"},"msg":"incoming request"}
{"time":"15:11:59","reqId":"req-2q","msg":"Downloading tarfile for site"}
{"time":"15:11:59","reqId":"req-2q","msg":"Processing tarball: /tmp/4a1dad813f85a909.gz"}
{"time":"15:11:59","reqId":"req-2q","msg":"Deleting old files"}
{"time":"15:11:59","reqId":"req-2q","msg":"Extracting tarball"}
{"time":"15:12:01","reqId":"req-2q","msg":"Performing sync with site"}
{"time":"15:12:01","reqId":"req-2q","msg":"[hyper] Sync Start"}
{"time":"15:12:01","reqId":"req-2q","msg":"[ipfs] Sync Start"}
{"time":"15:12:05","reqId":"req-2q","msg":"[hyper] Published: hyper://dc17wdupgqk75men4p8ywtimeq7fbajutnsa3j1997ni1s6in6py/"}

{"time":"15:15:46","reqId":"req-2q","msg":"[ipfs] Sync start"}
{"time":"15:15:46","reqId":"req-2q","msg":"[ipfs] Generated key: k51qzi5uqu5dmgqcsmmd4y5g717crmhzep30z88ic3cgc1jy32ufxpj0j0ybj3"}
{"time":"15:15:46","reqId":"req-2q","msg":"[ipfs] Got root CID: bafybeig367my34zdt77krv27zhliocol6iqpooryxkekptcdv2vzjxzs6m, performing IPNS publish (this may take a while)..."}
{"time":"15:15:49","reqId":"req-2q","msg":"[ipfs] Published to IPFS under k51qzi5uqu5dmgqcsmmd4y5g717crmhzep30z88ic3cgc1jy32ufxpj0j0ybj3: /ipfs/bafybeig367my34zdt77krv27zhliocol6iqpooryxkekptcdv2vzjxzs6m"}
{"time":"15:15:49","reqId":"req-2q","msg":"Finished sync"}
{"time":"15:15:49","reqId":"req-2q","res":{"statusCode":200},"responseTime":230262.7318353653,"msg":"request completed"}

{"time":"15:17:08","reqId":"req-2r","req":{"method":"GET","url":"/v1/sites/compost.testing.sutty.nl"},"msg":"incoming request"}
{"time":"15:17:08","reqId":"req-2r","res":{"statusCode":200},"responseTime":3.9629344940185547,"msg":"request completed"}

I've grouped log lines by time proximity, converted to readable timestamps and removed some unrelated info.

req-2p is the client sending a request to update the site's local data (links, enabled protocols). req-2q starts immediately after req-2p but is only logged a minute later, when the tarball finishes being received. All log lines are emitted together when the tarball finishes being extracted, so I think something is blocking there.

Then nothing happens for some minutes, while htop shows high CPU usage from the ipfs daemon and node processes. After that, a new block of log lines appears and the request finishes. I think that block is meant to be logged at different times too, and the ipfs sync starts twice!
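The silent stretch described above can be located mechanically. Here is a minimal TypeScript sketch, assuming log entries shaped like the DP-side excerpt (only `time`, `reqId` and `msg` are used). `findGaps` and the 60-second threshold are illustrative names and values, not part of distributed.press, and the messages are shortened.

```typescript
// Minimal shape of the DP-side log entries; all other fields are ignored.
interface LogEntry {
  time: string; // "HH:MM:SS", as in the excerpt
  reqId: string;
  msg: string;
}

interface Gap {
  reqId: string;
  seconds: number;
  before: string;
  after: string;
}

// Convert "HH:MM:SS" to seconds since midnight (no handling of day rollover).
function toSeconds(time: string): number {
  const parts = time.split(":").map(Number);
  return (parts[0] ?? 0) * 3600 + (parts[1] ?? 0) * 60 + (parts[2] ?? 0);
}

// Report pauses of at least `minSeconds` between consecutive entries
// that belong to the same request.
function findGaps(entries: LogEntry[], minSeconds: number): Gap[] {
  const last: Record<string, LogEntry | undefined> = {};
  const gaps: Gap[] = [];
  for (const entry of entries) {
    const prev = last[entry.reqId];
    if (prev !== undefined) {
      const seconds = toSeconds(entry.time) - toSeconds(prev.time);
      if (seconds >= minSeconds) {
        gaps.push({ reqId: entry.reqId, seconds, before: prev.msg, after: entry.msg });
      }
    }
    last[entry.reqId] = entry;
  }
  return gaps;
}

// A few req-2q entries from the excerpt, with shortened messages.
const entries: LogEntry[] = [
  { time: "15:11:59", reqId: "req-2q", msg: "incoming request" },
  { time: "15:12:01", reqId: "req-2q", msg: "[ipfs] Sync Start" },
  { time: "15:12:05", reqId: "req-2q", msg: "[hyper] Published" },
  { time: "15:15:46", reqId: "req-2q", msg: "[ipfs] Sync start" },
  { time: "15:15:49", reqId: "req-2q", msg: "Finished sync" },
];

const gaps = findGaps(entries, 60);
```

On these entries the sketch reports a single pause of 221 seconds for req-2q, between the hyper publish and the second `[ipfs] Sync start` line, which accounts for most of the 230262 ms `responseTime`.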

I'm not sure what req-2r is! I'm only making two requests.

Investigating, this seems to happen when there are two requests to publish happening right next to each other.

Gonna add unit tests to try to simulate this

commented

I was performing only one publish process at a time, and no one else was using it though.

I'm guessing this is related to MFS flushing taking way too long. Gonna try flushing just once at the end of a sync.
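The idea of flushing once at the end of a sync can be sketched as follows. This is only an illustration: `MfsLike`, `writeSite` and `CountingMfs` are hypothetical names, loosely modeled on the MFS `write` operation (with its `flush` option) and the separate `flush` operation. It does not reflect the project's actual code.

```typescript
// Hypothetical minimal interface, loosely modeled on MFS write/flush.
interface WriteOptions {
  create: boolean;
  parents: boolean;
  flush: boolean; // when false, the write is not flushed to disk immediately
}

interface MfsLike {
  write(path: string, content: Uint8Array, options: WriteOptions): Promise<void>;
  flush(path: string): Promise<void>;
}

interface SiteFile {
  path: string;
  content: Uint8Array;
}

// Write every file with flush disabled, then flush the site root once.
async function writeSite(mfs: MfsLike, root: string, files: SiteFile[]): Promise<void> {
  for (const file of files) {
    await mfs.write(`${root}/${file.path}`, file.content, {
      create: true,
      parents: true,
      flush: false,
    });
  }
  await mfs.flush(root);
}

// In-memory stand-in that only counts calls, to check the flushing pattern.
class CountingMfs implements MfsLike {
  writes = 0;
  eagerFlushes = 0; // writes that asked for an immediate flush
  finalFlushes: string[] = [];

  async write(_path: string, _content: Uint8Array, options: WriteOptions): Promise<void> {
    this.writes += 1;
    if (options.flush) this.eagerFlushes += 1;
  }

  async flush(path: string): Promise<void> {
    this.finalFlushes.push(path);
  }
}
```

With a per-write flush, the number of flushes grows with the number of files in the site. With this pattern it stays at one per sync, regardless of site size.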

Is this still an issue?

commented

I'll test now

commented

It took 65s now for sutty.nl which is 269MB, so I'd say it's a lot better!

Cool, I'm about to update some of the RAM usage stuff and see if it helps further.

We've got this down to a few seconds now. I think the AcceleratedDHTClient is what ultimately fixed the issues we had.