wal-e / wal-e

Continuous Archiving for Postgres

Different retention for base backups and WAL segments

afn opened this issue

My understanding (correct me if I'm wrong) is that backups can be recovered using just a base backup, and the WAL segments are only needed to restore to an intermediate state between base backups.

Running wal-e delete retain or wal-e delete before deletes both base backups and WAL segments matching the specified criteria. But it would be useful to have different granularity over different time periods, so that users who have, say, a weekly scheduled base backup can do fine-grained recovery (using base backup + WAL segments) for recent history and coarse-grained recovery (using only base backups) for older archives.

I propose adding a new option to wal-e delete, --wal-only. Then a retention policy can be enforced by running, say:

wal-e delete retain 52             # with weekly base backups: delete everything older than 52 weeks
wal-e delete --wal-only retain 8   # delete WAL files older than 8 weeks

Thoughts?

I tried doing exactly this with a custom script using 'gsutil', since I was using Google Cloud Storage. The problem is that base backups by themselves aren't sufficient for recovery: each one also needs the WAL segments that were generated while the backup was being taken. The WAL segments required by each base backup can be found with the --detail option of the 'backup-list' subcommand. So, when deleting WAL segments, we have to keep track of these segments for every base backup that is still present.
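The bookkeeping described above can be sketched roughly as follows. This is only an illustration, not part of wal-e: the function names are hypothetical, the segment names are assumed to be the plain 24-character hexadecimal names (any storage prefix or compression suffix stripped), and the start/stop pairs are assumed to have been read from `backup-list --detail` beforehand.

```python
from typing import Iterable, List, Tuple

# (wal_segment_backup_start, wal_segment_backup_stop) for one base backup.
SegmentRange = Tuple[str, str]


def is_protected(segment: str, ranges: Iterable[SegmentRange]) -> bool:
    """Return True if `segment` lies inside any base backup's start/stop range."""
    seg = segment.upper()
    for start, stop in ranges:
        start, stop = start.upper(), stop.upper()
        # The first 8 hex digits are the timeline. Names are fixed-width hex,
        # so string comparison gives the right order within one timeline.
        if seg[:8] == start[:8] == stop[:8] and start <= seg <= stop:
            return True
    return False


def wal_segments_to_delete(
    segments: Iterable[str], cutoff: str, ranges: Iterable[SegmentRange]
) -> List[str]:
    """Segments older than `cutoff` that no retained base backup depends on."""
    ranges = list(ranges)
    return [
        s for s in segments
        if s.upper() < cutoff.upper() and not is_protected(s, ranges)
    ]
```

Here `cutoff` stands for the first segment that should always be kept (for example, the start segment of the oldest backup inside the fine-grained window), and `ranges` holds the start/stop pairs of every base backup that is still retained.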

I ended up setting up a script that copies the older base backups and their corresponding WAL segments to another directory, and then running the 'delete retain 8' command on the main backup location. WAL-E can successfully restore from both locations as long as all environment variables are set correctly.

@rohit-smpx If a base backup alone is not enough for a consistent backup, how do I restore a "snapshot" of the database?

@damirda When a base backup is taken, the WAL files generated between the start and the end of the backup process are also needed to restore from that base backup. The range of WAL files corresponding to each base backup can be found using the --detail option of the backup-list operation.

From the README (https://github.com/wal-e/wal-e#backup-list):

wal_segment_backup_start

The wal segment number. It is a 24-character hexadecimal number. This information identifies the timeline and relative ordering of various backups.

wal_segment_backup_stop

The last WAL segment file required to bring this backup into a consistent state, and thus available for hot-standby.

So, if you have the base backup and all the WAL segments in this range (from wal_segment_backup_start through wal_segment_backup_stop), the restored backup will be consistent.
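As a rough sketch of how that check could be automated, the snippet below enumerates every segment name between a start and a stop segment and reports the ones missing from an archive listing. The helper names are hypothetical and not part of wal-e. It also assumes the default 16 MB segment size on PostgreSQL 9.3 or later, where each 8-digit log number contains 256 segments; older servers or a non-default segment size would need a different `SEGMENTS_PER_LOG`.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumption: 16 MB WAL segments on PostgreSQL 9.3+, so 0x00..0xFF per log.
SEGMENTS_PER_LOG = 0x100


def parse(name: str) -> Tuple[int, int, int]:
    """Split a 24-character WAL segment name into (timeline, log, seg)."""
    if len(name) != 24:
        raise ValueError("expected a 24-character segment name: %r" % name)
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


def segments_between(start: str, stop: str) -> Iterator[str]:
    """Yield every segment name from `start` through `stop`, inclusive."""
    tli, log, seg = parse(start)
    stop_tli, stop_log, stop_seg = parse(stop)
    if tli != stop_tli:
        raise ValueError("start and stop are on different timelines")
    first = log * SEGMENTS_PER_LOG + seg
    last = stop_log * SEGMENTS_PER_LOG + stop_seg
    for n in range(first, last + 1):
        yield "%08X%08X%08X" % (tli, n // SEGMENTS_PER_LOG, n % SEGMENTS_PER_LOG)


def missing_segments(start: str, stop: str, archived: Iterable[str]) -> List[str]:
    """Segments in [start, stop] that are absent from the archive listing."""
    have = {a.upper() for a in archived}
    return [s for s in segments_between(start, stop) if s not in have]
```

An empty result from `missing_segments` for a backup's start/stop pair would suggest that the backup has the WAL it needs to reach a consistent state.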

Yes, this helps, especially in combination with recovery_target = 'immediate' in recovery.conf. Thank you!

What about an additional option like:

 wal-e delete retain periodic <num_of_days> <num_of_weeks> <num_of_months>
 # num_of_days is how many days to keep at most one base backup per day, overrides num_of_weeks and num_of_months
 # num_of_weeks is how many weeks to keep at most one base backup per week, overrides num_of_months
 # num_of_months is how many months to keep at most one base backup per month

So if someone wanted to keep daily backups for a week, weekly backups for a month, and monthly backups for a year, they could have a cron entry like this:

0 0 * * * wal-e delete retain periodic 7 4 12
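One possible reading of that selection rule, sketched using nothing but the backup timestamps: walk the backups from newest to oldest and, for each rule, keep the newest backup in each of the most recent calendar days, ISO weeks, and months that contain a backup. The function name and the "newest backup in each bucket wins" choice are assumptions made for illustration, not existing wal-e behaviour.

```python
from datetime import datetime
from typing import Iterable, List, Set


def select_periodic(
    backups: Iterable[datetime], num_days: int, num_weeks: int, num_months: int
) -> List[datetime]:
    """Return the backup times to keep, newest first."""
    newest_first = sorted(set(backups), reverse=True)
    rules = [
        (num_days, lambda t: t.date()),                      # one per calendar day
        (num_weeks, lambda t: tuple(t.isocalendar())[:2]),   # one per ISO (year, week)
        (num_months, lambda t: (t.year, t.month)),           # one per calendar month
    ]
    keep: Set[datetime] = set()
    for limit, bucket_of in rules:
        seen = set()
        for t in newest_first:
            if len(seen) >= limit:
                break  # this rule has already filled all of its buckets
            bucket = bucket_of(t)
            if bucket not in seen:
                # First (newest) backup seen in this bucket: keep it.
                seen.add(bucket)
                keep.add(t)
    return sorted(keep, reverse=True)
```

Anything not returned would be a candidate for deletion, subject to the earlier point that each kept base backup still needs the WAL segments between its start and stop segment.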

I could probably get a PR going this weekend for this if anyone thinks this is possible and would be performant without digging too deep into storing additional metadata. I'm hesitant to open a PR that modifies how the files are stored in S3 so I'd rather avoid that and just use existing available metadata.

@ScottKelly Your description above is exactly what I am looking for. Did you implement something to do that in your environment?