benbjohnson / litestream

Streaming replication for SQLite.

Home Page: https://litestream.io

Update to 0.4.0-beta.2: no snapshots available on restore

ohthehugemanatee opened this issue · comments

I had an amd64 node die in my cluster and had to migrate some litestream-dependent workloads to arm64. Looks like there's no arm64 build of 0.3.8 available, so I had to run 0.4.0-beta.2 instead.

But on first run on the new node, the init container fails to restore. Apparently it can't see any replicas.

#> litestream restore /db/sonarr.db
cannot determine latest generation: generation time bounds: no snapshots available

#> litestream snapshots /db/sonarr.db
replica  generation  index  size  created

The backup directory has plenty of data:

#> tree /db.backup/sonarr.db
/db.backup/sonarr.db
└── generations
    └── e7d6ea14e0477efe
        ├── snapshots
        │   └── 0001bd38.snapshot.lz4
        └── wal
            ├── 0001bd38_00000000.wal.lz4
            ├── 0001bd38_00000438.wal.lz4
            ├── 0001bd38_000049d0.wal.lz4
            ├── 0001bd38_00005618.wal.lz4
             ...
            ├── 0001bd98_00009380.wal.lz4
            └── 0001bd98_0000a7f8.wal.lz4

4 directories, 396 files

What am I missing?

I just tried with a 0.3.8 container running on amd64, and it worked fine. So I've validated that this is either a regression or missing upgrade documentation. Here's my complete setup:

Sidecar container to replicate:

        - name: sonarr-litestream
          image: litestream/litestream:0.3.8
          args: ['replicate']
          volumeMounts:
          - name: sonarr-db
            mountPath: /db
          - name: nfs
            subPath: ".docker/config/sonarr/db-backup.litestream"
            mountPath: /db.backup
          - name: litestream-config
            mountPath: /etc/litestream.yml
            subPath: litestream.yml

litestream.yml:

dbs:
  - path: /db/sonarr.db
    replicas:
      - path: /db.backup/sonarr.db

Now trying to restore from a one-off container.

❯ k run -it --tty litestream --image=litestream/litestream:0.4.0-beta.2 --overrides='
{
  "spec": { "nodeSelector": { "kubernetes.io/arch": "amd64" },
    "volumes": [
      {
        "name": "nfs",
        "persistentVolumeClaim": {
          "claimName": "nfs-claim"
        }
      },
      {
        "name": "litestream-config",
        "configMap": {
          "name": "sonarr-litestream"
        }
      }
    ],
    "containers": [
      {
        "name": "init-litestream",
        "image": "litestream/litestream:0.4.0-beta.2",
        "command": ["sleep"], "args": ["1200"], "volumeMounts": [
          {
            "name": "nfs",
            "subPath": ".docker/config/sonarr/db-backup.litestream",
            "mountPath": "/db.backup"
          },
          {
            "name": "litestream-config",
            "mountPath": "/etc/litestream.yml",
            "subPath": "litestream.yml"
          }
        ],
        "stdin": true,
        "stdinOnce": true,
        "tty": true
      }
    ]
  }
}'  bash

It only succeeds if that restore container is running the same version as created the backups in the first place.

But then... how am I supposed to change my setup to 0.4.0? Is there an upgrade script necessary?

@ohthehugemanatee The replica directory structure changed in v0.4.0. You won't be able to restore a backup from v0.3.x, but when you start up v0.4.0, it should see that there are no snapshots available, create a new one, and begin replicating from there.

Looks like there's no arm64 build of 0.3.8 available, so I had to run 0.4.0-beta.2 instead.

What OS are you running? There's some arm64 builds for Linux for v0.3.8: https://github.com/benbjohnson/litestream/releases/tag/v0.3.8

Thanks for answering! I'm on Kubernetes (so, Linux containers), and I don't see any container builds for 0.3.8 on arm64. Any chance you could run the multi-arch Docker pipeline on 0.3.8 once so we get one?

when you start up v0.4.0, it should see that there are no snapshots available and issue a new one and begin replicating from there.

The problem is, if you use Litestream as "a simple persistence layer for Kubernetes services", you are effectively stuck on 0.3.x forever. By definition, every container starts without state; the database is created by litestream restore. If you update to 0.4, that step fails, and you have an empty db... nothing to replicate from.
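To make the failure mode concrete, the pattern in question is an init container along these lines (a sketch reusing the volumes from my sidecar above; -if-db-not-exists is the v0.3.x restore flag that makes the restore a no-op when the volume already has a database, which is exactly the step that breaks under v0.4.0):

```yaml
      initContainers:
        - name: init-litestream
          image: litestream/litestream:0.3.8
          # On a fresh (ephemeral) volume, this restore is the ONLY source of
          # state for the app. With v0.4.0-beta.2 it fails with
          # "no snapshots available" against a v0.3.x replica.
          args: ['restore', '-if-db-not-exists', '/db/sonarr.db']
          volumeMounts:
            - name: sonarr-db
              mountPath: /db
            - name: nfs
              subPath: ".docker/config/sonarr/db-backup.litestream"
              mountPath: /db.backup
            - name: litestream-config
              mountPath: /etc/litestream.yml
              subPath: litestream.yml
```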

I guess you have to exec into a running container that has the DB, install a copy of Litestream 0.4, and run the first replication from there. Will 0.4.0 break the existing 0.3.8 backup store if it writes over the same location?

I am also seeing this when attempting to restore from S3.

If I run this command on v0.3.8 on my local machine, it works:

litestream restore --config litestream/primary.yml -replica s3 -o mydb.db /data/mydb

But v0.4.0-beta2 complains:

cannot determine latest generation: generation time bounds: no snapshots available

This is indeed a bit more sinister than initially thought. I didn't understand at first that "it should see that there are no snapshots available and issue a new one and begin replicating from there" means the replication target is treated as empty and the latest local file is assumed to exist.

We're also going to use Litestream on Kubernetes with ephemeral mounts, and I recently switched over from 0.3.9 to master to start tracking 0.4.0 before it's released. When switching to the new version, we also hit the "no snapshots available" issue, but since it's not yet in production, it wasn't as critical for us as it would have been for someone already running it in production.

Even though we can sidestep this issue by skipping 0.3.x entirely, it would likely be a good idea to add support during restore for reading the old 0.3.x structure when the 0.4.x one turns up empty. At the very least, a big warning in the 0.4.0 release notes that restoring from a 0.3.x snapshot is not supported and is a breaking change, so people have a heads-up.
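Until such a fallback exists, a restore wrapper could at least detect the old layout and fail loudly instead of starting from an empty database. A minimal sketch (the generations/&lt;id&gt;/snapshots/ layout is taken from the tree output earlier in this thread; the fixture directory and file names here are made up for demonstration):

```shell
#!/bin/sh
# has_v03_layout <replica-dir>: succeeds if the directory contains the
# v0.3.x-style generations/<id>/snapshots/*.snapshot.lz4 structure.
has_v03_layout() {
  ls "$1"/generations/*/snapshots/*.snapshot.lz4 >/dev/null 2>&1
}

# Throwaway fixture mimicking the v0.3.x replica tree shown above.
FIXTURE=$(mktemp -d)
REPLICA="$FIXTURE/sonarr.db"
mkdir -p "$REPLICA/generations/e7d6ea14e0477efe/snapshots"
touch "$REPLICA/generations/e7d6ea14e0477efe/snapshots/0001bd38.snapshot.lz4"

# Guard: a v0.4.x restore against this tree would report "no snapshots
# available"; abort here rather than let the app start with an empty db.
if has_v03_layout "$REPLICA"; then
  echo "v0.3.x replica layout detected; aborting restore"
fi
```

In a real init container the echo would be an `exit 1`, so the pod fails visibly instead of silently re-seeding the replica.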

Thanks!

As the old main branch was scrapped I'll close this as it's not relevant anymore.