Server: broken journal after hard reset
BitHeaven-Official opened this issue
Hard reset broke journal
When the server suddenly shuts down, the skytable journal breaks.
Steps to reproduce
Steps to reproduce the behavior:
- Run skyd
- Disconnect the server from the power supply or just press the reset button
- Turn on the server
- The journal is broken and skyd no longer runs
Expected behavior
I expected skyd to simply restart instead of failing with an error.
Meta
- Release tag: v0.8.0
- Branch: v0.8.0
- Commit ID: 41e091cd0f6861cbaca2c6d73e023f698ec3f1a8
- Operating system: openSUSE Tumbleweed aarch64
Additional context
Working state (before the hard reset):
Mar 21 23:44:23 secretbase skyd[5617]: ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██
Mar 21 23:44:23 secretbase skyd[5617]: ███████ █████ ████ ██ ███████ ██████ ██ █████
Mar 21 23:44:23 secretbase skyd[5617]: ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██
Mar 21 23:44:23 secretbase skyd[5617]: ███████ ██ ██ ██ ██ ██ ██ ██████ ███████ ███████
Mar 21 23:44:23 secretbase skyd[5617]: Skytable v0.8.0 | https://github.com/skytable/skytable
Mar 21 23:44:23 secretbase skyd[5617]: [2024-03-21T15:44:23Z WARN skyd::engine] running in dev mode
Mar 21 23:44:23 secretbase skyd[5617]: [2024-03-21T15:44:23Z INFO skyd::engine] starting storage engine
Mar 21 23:44:23 secretbase skyd[5617]: [2024-03-21T15:44:23Z INFO skyd::engine::storage] initializing databases
Mar 21 23:44:23 secretbase skyd[5617]: [2024-03-21T15:44:23Z INFO skyd::engine] storage engine ready. initializing system
Mar 21 23:44:23 secretbase skyd[5617]: [2024-03-21T15:44:23Z INFO skyd::engine] listening on tcp@0.0.0.0:2003
After hard reset:
███████ ██ ██ ██ ██ ████████ █████ ██████ ██ ███████
██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██
███████ █████ ████ ██ ███████ ██████ ██ █████
██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██
███████ ██ ██ ██ ██ ██ ██ ██████ ███████ ███████
Skytable v0.8.0 | https://github.com/skytable/skytable
[2024-03-21T15:55:40Z WARN skyd::engine] running in dev mode
[2024-03-21T15:55:40Z INFO skyd::engine] starting storage engine
[2024-03-21T15:55:40Z WARN skyd::engine::storage] older storage format detected
[2024-03-21T15:55:40Z INFO skyd::engine::storage] loading data
[2024-03-21T15:55:40Z ERROR skyd] storage error error: loading storage-v1 in compatibility mode; storage error: journal-corrupted
Same in prod mode.
There is no obvious fix for this error. The only real option is to allow explicit repair (which IMO should definitely be added) instead of the system we currently have in place. Also, automatic recovery based on severity should be provided, and it should be configurable on the user end.
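To make the idea of severity-based recovery concrete, here is a minimal sketch. It is not Skytable's journal format or recovery code: the record layout, the `RecoveryPolicy` and `Damage` types, and the `scan`, `recover`, and `append_record` functions are all hypothetical, and an FNV-1a hash stands in for whatever checksum a real journal would use. The assumption is that a hard reset usually leaves a torn record at the tail, which is low severity and could be truncated automatically, while damage in the middle of the journal is high severity and should need an explicit decision.

```rust
// Hypothetical sketch: none of these names or the record layout come from Skytable.
// Assumed record layout: [payload length: u32 LE][checksum: u32 LE][payload].

/// What the operator allows the server to do when the journal is damaged.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RecoveryPolicy {
    /// Refuse to start on any damage.
    Fail,
    /// Truncate a torn tail record automatically; still refuse on anything worse.
    AutoRepairTail,
}

/// Severity classification produced by scanning the journal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Damage {
    Clean,
    /// The last record is incomplete or fails its checksum (typical after power loss).
    TornTail { valid_len: usize },
    /// A bad record is followed by more data.
    MidJournal { offset: usize },
}

const HEADER_LEN: usize = 8;

/// FNV-1a (32-bit), used here only as a stand-in checksum.
fn checksum(data: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for &byte in data {
        hash ^= u32::from(byte);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

/// Append one record to an in-memory journal image.
pub fn append_record(journal: &mut Vec<u8>, payload: &[u8]) {
    journal.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    journal.extend_from_slice(&checksum(payload).to_le_bytes());
    journal.extend_from_slice(payload);
}

/// Walk the journal record by record and classify the first problem found.
/// Limitation: a corrupted length field that points past the end of the file
/// is indistinguishable from a truncated record and is reported as a torn tail.
pub fn scan(journal: &[u8]) -> Damage {
    let mut pos = 0;
    while pos < journal.len() {
        let rest = &journal[pos..];
        if rest.len() < HEADER_LEN {
            return Damage::TornTail { valid_len: pos };
        }
        let len = u32::from_le_bytes([rest[0], rest[1], rest[2], rest[3]]) as usize;
        let sum = u32::from_le_bytes([rest[4], rest[5], rest[6], rest[7]]);
        if len > rest.len() - HEADER_LEN {
            return Damage::TornTail { valid_len: pos };
        }
        let end = pos + HEADER_LEN + len;
        if checksum(&journal[pos + HEADER_LEN..end]) != sum {
            return if end == journal.len() {
                Damage::TornTail { valid_len: pos }
            } else {
                Damage::MidJournal { offset: pos }
            };
        }
        pos = end;
    }
    Damage::Clean
}

/// Apply the policy. Returns the number of bytes dropped, or an error
/// when the damage is more severe than the policy permits repairing.
pub fn recover(journal: &mut Vec<u8>, policy: RecoveryPolicy) -> Result<usize, String> {
    match (scan(journal), policy) {
        (Damage::Clean, _) => Ok(0),
        (Damage::TornTail { valid_len }, RecoveryPolicy::AutoRepairTail) => {
            let dropped = journal.len() - valid_len;
            journal.truncate(valid_len);
            Ok(dropped)
        }
        (damage, _) => Err(format!("journal-corrupted: {:?}", damage)),
    }
}
```

With something like this, `RecoveryPolicy::Fail` would keep the behavior reported above, while `RecoveryPolicy::AutoRepairTail` would let the server come back up after a hard reset at the cost of dropping the last, incomplete write. Mid-journal damage always returns an error, leaving it to an explicit repair step.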
I'm done working on the recovery system. I'll add a few more tests and then we should be good to go.
PR is up