Host media export failed due to "context deadline exceeded"
jaywink opened this issue
An export for host media ran for a while and then ended up failing with:
time="2023-12-18 12:13:52.195 Z" level=error msg="Error during export: Get \"https://redacted.s3.dualstack.eu-north-1.amazonaws.com/long-random-id-looking-thing-here\": context deadline exceeded" internal_flag=1 task_id=4
Possibly an error downloading from the S3 bucket?
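For reference, the `Get "…": context deadline exceeded` shape is what Go's `net/http` client returns when the request's context expires before the response arrives. The sketch below reproduces it against a deliberately slow local test server. It is not the media repo's actual download code; `FetchWithDeadline`, `DemoSlowFetch`, and the timeout values are hypothetical names and numbers chosen for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// FetchWithDeadline (hypothetical helper) issues a GET that is abandoned
// once the given timeout elapses.
func FetchWithDeadline(url string, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		// On timeout this is a *url.Error wrapping context.DeadlineExceeded.
		return err
	}
	defer resp.Body.Close()
	return nil
}

// IsDeadlineExceeded reports whether err was caused by an expired context.
func IsDeadlineExceeded(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

// DemoSlowFetch (hypothetical) starts a local server that stalls longer than
// the client is willing to wait and returns the resulting client error.
func DemoSlowFetch() error {
	slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(500 * time.Millisecond): // simulated slow storage
		case <-r.Context().Done(): // client gave up
		}
	}))
	defer slow.Close()

	return FetchWithDeadline(slow.URL, 50*time.Millisecond)
}

func main() {
	err := DemoSlowFetch()
	fmt.Println("deadline exceeded:", IsDeadlineExceeded(err))
	fmt.Println("error:", err)
}
```

Checking with `errors.Is` rather than matching the message text lets a caller tell a timeout apart from other download failures.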
Unfortunately the export is not marked as failed in any way. Polling the task shows the following information:
{
  "task_id": 4,
  "task_name": "export_data",
  "params": {
    "export_id": "export-id-redacted",
    "include_s3_urls": false,
    "server_name": "domain-redacted"
  },
  "start_ts": 1702901122746,
  "end_ts": 1702901632195,
  "is_finished": true
}
It's not possible for a script to know whether the export finished successfully or not.
Idea: maybe a `status` field with something like `success`, `error`, etc., or some other field to indicate whether the export finished successfully?
The logs from a retry with debug logging enabled (shown newest first) have a slightly confusing ordering 🤔
time="2023-12-18 13:48:15.045 Z" level=info msg="Task 'export_data' completed" internal_flag=1 task_id=6
time="2023-12-18 13:48:15.042 Z" level=error msg="Error during export: context deadline exceeded" internal_flag=1 task_id=6
time="2023-12-18 13:48:14.985 Z" level=debug msg="Writing tar file to gzip container: export-manifest.tar" internal_flag=1 task_id=6 v2archive-entity=domain.tld v2archive-id=export-id
time="2023-12-18 13:48:13.532 Z" level=debug msg="Writing tar file to gzip container: export-part-3.tar" internal_flag=1 task_id=6 v2archive-entity=domain.tld v2archive-id=export-id
time="2023-12-18 13:48:13.532 Z" level=debug msg="Finishing export archive" internal_flag=1 task_id=6
time="2023-12-18 13:47:13.530 Z" level=debug msg="Getting whole cached object for bc4663ed5d156254cb2443c7b665...." internal_flag=1 task_id=6
time="2023-12-18 13:47:13.526 Z" level=debug msg="Downloading mxc://domain.tld/745a3df95f1332106053c3..." internal_flag=1 task_id=6
time="2023-12-18 13:47:13.144 Z" level=debug msg="Getting whole cached object for cdbe2ac9fa0e3c9f8e9f7c664b18e46ca..." internal_flag=1 task_id=6
time="2023-12-18 13:47:13.140 Z" level=debug msg="Downloading mxc://domain.tld/3e17e65440100d8be78c5..." internal_flag=1 task_id=6
time="2023-12-18 13:47:12.644 Z" level=debug msg="Getting whole cached object for 0e9f5108a779ba5dd8fdf7e32adcf18..." internal_flag=1 task_id=6
Given that both export tasks produced identical tar files, it is probably a single file that the export fails on?
This is approximately 10% of the media usage for this host (as reported by the mediarepo admin API).
Fixed by #508