kruckenb / split_postgres_dump

Split output from Postgres pg_dump into separate files for schema and individual table data

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Separate output from pg_dump into multiple files (dump.####.<name>)

Intended to maximize compression of de-dup'ing backup system

Theory: By separating table data into individual files, (1) de-dup'ing of
unchanged tables is very high, unaffected by other tables; (2) tables
with new (appended) records can still be highly compressed because
older records (positioned earlier in the dump) are de-dup'ed; (3) only
new tables and tables with changed rows have lower compression with
less de-dup'ing

Files are numbered to maintain order for reconstruction
First file (dump.0000.pre-schema) contains first part of schema information
Next files contain table data, in same order as pg_dump output
Last file (dump.####.post-schema) contains last part of schema information

Use: pg_dump | split-pg-dump.pl

To reconstruct complete dump (for restoring, etc)
  cat dump.* > dump.txt

About

Split output from Postgres pg_dump into separate files for schema and individual table data


Languages

Language:Perl 100.0%