toddsundsted / ktistec

Single user ActivityPub (https://www.w3.org/TR/activitypub/) server.

Upgrading does not work

winks opened this issue · comments

I tried to upgrade from a dist version from ~2022-04-26 to 2.0.0-5 and there seem to be some db migrations missing or not documented well.

Unfortunately so far the only message I got was:

Exception while running rules: more than one row (DB::Error)
  from ???
  from ???
  from ???
  from ???
  from ???
  from ???
  from ???
  from ???
  from src/env/__libc_start_main.c:94:2 in 'libc_start_main_stage2'

but maybe the db changes are small enough to just manually do the migration, I'll look further into the code.

@winks you might be running into #45 (comment)

if so, did the migration and server startup continue after the error (it should have)? i made modifications after the first error report to log the error and continue. it's in a migration that rebuilds the timeline and notifications (vs. changing the schema of the database). how many of these errors did you get?

I didn't count, but 20ish for sure.

Meanwhile I rebuilt without --no-debug (crystal build src/ktistec/server.cr --static), and after running that twice it seems to have fixed itself (third or fourth try overall). So maybe it was indeed fixing itself, but slowly.

Sorry for the noise, I'll keep an eye on it - but right now it looks ok.

thanks! did you get the same number of errors each time, or did it vary?

That's lost to the terminal history, sorry. I manually scrolled back and saw that it was the same error, then rolled back to the old version - didn't really count or save it, but maybe I can replay from backup and count later.

Reran with the old db dump earlier:

$ grep Batch asdf | wc -l
36
$ grep Exception asdf | sort | uniq -c
   2003 Exception while running rules: more than one row (DB::Error)

I was patient and at some point the log had:

ktistecdev_1  | Batch 36 complete
ktistecdev_1  | update-timeline-and-notifications: applied in 246.2214s
ktistecdev_1  | add-indexes-on-actor-iri-and-target-iri-to-activities: applied in 0.9925s
ktistecdev_1  | [development] Ktistec is ready to lead at http://0.0.0.0:3000

and it went just fine. So I guess the main problem is the stack trace, and that there are so many of them that a user who doesn't look at the log in detail assumes the upgrade failed, like I did.
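For anyone who wants to check a saved log without scrolling through thousands of stack traces, here is a minimal Python sketch of the same idea as the grep commands above. The function name summarize_log is hypothetical, and the line patterns are assumptions based only on the log excerpts quoted in this thread, so they may not match other versions.

```python
import re
from collections import Counter


def summarize_log(lines):
    """Count completed batches and repeated errors, and detect the ready line."""
    errors = Counter()
    batches = 0
    ready = False
    for line in lines:
        # Strip a docker-compose style prefix such as "ktistecdev_1  | ".
        text = re.sub(r"^\S+\s+\|\s", "", line.rstrip("\n"))
        if re.match(r"Batch \d+ complete", text):
            batches += 1
        elif text.startswith("Exception"):
            errors[text] += 1
        elif "is ready to lead" in text:
            # Assumed marker for a successful startup, taken from the log above.
            ready = True
    return batches, errors, ready


# Small sample built from lines quoted in this thread.
sample = [
    "Exception while running rules: more than one row (DB::Error)",
    "  from ???",
    "Exception while running rules: more than one row (DB::Error)",
    "ktistecdev_1  | Batch 36 complete",
    "ktistecdev_1  | update-timeline-and-notifications: applied in 246.2214s",
    "ktistecdev_1  | [development] Ktistec is ready to lead at http://0.0.0.0:3000",
]

batches, errors, ready = summarize_log(sample)
print(f"batches: {batches}, ready: {ready}")
for message, count in errors.items():
    print(f"{count:7d} {message}")
```

To run it on a real log, pass an open file instead of the sample list, e.g. summarize_log(open("asdf")). If the ready line shows up, the errors were logged and skipped rather than fatal.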

@winks see this issue before you upgrade: #55

you might also have duplicate rows in your actors and objects tables; in fact, given this error, i suspect you do. recent changes add a uniqueness constraint in the database to prevent this, but you will need to explicitly delete the duplicates before it can be applied.
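A minimal Python sketch of how one might look for such duplicates before upgrading. It runs against a toy in-memory SQLite table, not a real Ktistec database: the simplified schema, the assumption that duplicates share the same value in an iri column, and the helper name find_duplicate_iris are all illustrative. It only reports duplicates; the actual clean-up steps are the ones described in #55, and a backup should come first.

```python
import sqlite3


def find_duplicate_iris(conn, table):
    """Return (iri, count) pairs for every iri that appears more than once."""
    # The table name is interpolated into the SQL, so restrict it to known names.
    if table not in ("actors", "objects"):
        raise ValueError(f"unexpected table: {table}")
    query = (
        f"SELECT iri, COUNT(*) FROM {table} "
        "GROUP BY iri HAVING COUNT(*) > 1 ORDER BY iri"
    )
    return conn.execute(query).fetchall()


# Toy database standing in for the real one (assumed, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actors (id INTEGER PRIMARY KEY, iri TEXT)")
conn.executemany(
    "INSERT INTO actors (iri) VALUES (?)",
    [
        ("https://a.example/actors/x",),
        ("https://a.example/actors/x",),  # duplicate row
        ("https://b.example/actors/y",),
    ],
)

duplicates = find_duplicate_iris(conn, "actors")
for iri, count in duplicates:
    print(f"{count} rows share iri {iri}")
```

An empty result for both tables would suggest the uniqueness constraint can be applied without manual clean-up, at least under the assumptions above.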

FYI, I had the same issue with 2.0.0-6 (not with 2.0.0-5 for some reason, though). The info from #55 (comment) cleared the issue and the docker-image started successfully afterwards.

Might be a good idea to have a periodic clean-up process in ktistec? Or is the risk of data-loss too high in that case?

that could always be managed with a backup. aside from cases like the one that led to the duplicates, which are going to cause problems (like they did), i'm trying to make the database resilient to garbage, if for no other reason than the fact that my database, which is about two years old now, probably is full of it!