cybozu-go / moco

MySQL operator on Kubernetes using GTID-based semi-synchronous replication.

Home Page: https://cybozu-go.github.io/moco/

Backup cannot be taken when the last backup-ed Pod is not Ready

arosh opened this issue

Describe the bug

When a MySQL backup job runs, it fails if the Pod that the previous backup was taken from is not Ready. mysql-backup outputs the following error message:

Error: failed to choose source instance: failed to show master status: dial tcp: lookup moco-CLUSTERNAME-0.moco-CLUSTERNAME.MYNAMESPACE.svc on 10.xxx.yyy.zzz:53: no such host

Environments

  • Version: MOCO v0.16.1
  • OS: Flatcar Container Linux (stable)

To Reproduce

  1. Deploy MySQLCluster
  2. Take backup
  3. Stop the Pod that the backup in step 2 was taken from (.Status.Backup.SourceIndex)
  4. Take backup again

Expected behavior
Backups are taken from another Ready replica.

Additional context
The following statement in ChoosePod seems to be intended as a fallback for the case where the Pod that the last backup was taken from is not Ready.

https://github.com/cybozu-go/moco/blob/v0.16.1/backup/backup.go#L249

However, just before that statement, ChoosePod tries to get the status of the Pod that the last backup was taken from. Hence, if that Pod is not Ready, the backup fails before the fallback is reached.

https://github.com/cybozu-go/moco/blob/v0.16.1/backup/backup.go#L216-L230
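
To illustrate the expected behavior, here is a minimal sketch of a selection routine that checks readiness before doing anything else with the previous source Pod. It is not MOCO's actual ChoosePod: the podStatus type, its fields, and the choosePod function are hypothetical names made up for this example, and it assumes readiness is already known (for instance from the Pod's conditions) without connecting to MySQL. It also only considers replicas, which is a simplification.

```go
package main

import (
	"errors"
	"fmt"
)

// podStatus is a hypothetical, simplified view of one MySQL instance.
// It is not a MOCO type.
type podStatus struct {
	Index   int  // ordinal of the Pod in the StatefulSet
	Ready   bool // assumed to come from the Pod's Ready condition
	Primary bool // true if this instance is the current primary
}

// choosePod is a hypothetical sketch of the expected selection logic.
// It never touches a Pod that is not Ready, so an unreachable previous
// source cannot make the whole backup fail.
func choosePod(pods []podStatus, lastIndex int) (int, error) {
	// Prefer the previous source, but only if it is a Ready replica.
	for _, p := range pods {
		if p.Index == lastIndex && p.Ready && !p.Primary {
			return p.Index, nil
		}
	}
	// Otherwise fall back to any other Ready replica.
	for _, p := range pods {
		if p.Ready && !p.Primary {
			return p.Index, nil
		}
	}
	return -1, errors.New("no Ready replica available")
}

func main() {
	// Pod 0 was the previous backup source and has been stopped.
	pods := []podStatus{
		{Index: 0, Ready: false},
		{Index: 1, Ready: true, Primary: true},
		{Index: 2, Ready: true},
	}
	idx, err := choosePod(pods, 0)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("source index:", idx)
}
```

With the input above, the stopped Pod 0 is skipped and the Ready replica with index 2 is chosen. The point of the sketch is the ordering: the readiness check comes first, and only a Ready Pod would then be queried for its replication status.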