cybertec-postgresql / pg_squeeze

A PostgreSQL extension for automatic bloat cleanup

Ability for Redefinition of table.

jobinau opened this issue

As the partitioning feature in PG12+ is great, there is a growing requirement among users for a tool that converts a non-partitioned table into a partitioned one. Currently, users do this with a lot of downtime. Since pg_squeeze already has everything needed to copy data over to a new table definition, would it be possible to use it for copying and swapping the table?
I expect this would be achievable if the user were allowed to specify a custom-created transient_table.
Sorry for my poor understanding of the code base and if I am completely wrong.

+1 for the idea. The user would probably still need to create the table beforehand, with the desired partitioning already in place. @ahouska do you think it would be doable?

Yes, agreed, @kmoppel. The user should be allowed to create the table beforehand however he/she wants, with partitioning or whatever else. That opens up different possibilities and use cases.
Another use case for this feature is online schema changes. For example, I want to convert a column from the "NUMERIC" datatype to "INT".

The ability to change the table definition was already proposed, see #15 and #18.

pg_squeeze relies on the fact that definition of the table it's currently processing does not change - it checks the system catalog several times and aborts if any related catalog change is detected. If pg_squeeze performed catalog changes itself, it'd become significantly more complex.
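To illustrate the "check several times and abort" idea in the previous paragraph, here is a toy model in Python. It is not pg_squeeze's code and does not talk to PostgreSQL: the `catalog` dict, `fingerprint`, `check_unchanged` and `CatalogChanged` are all hypothetical names made up for this sketch. It only shows the general pattern of remembering a definition at the start and refusing to continue once it differs.

```python
import hashlib
import json


class CatalogChanged(Exception):
    """Raised when the table definition differs from the one seen at start."""


def fingerprint(definition: dict) -> str:
    """Stable hash of a (hypothetical) table-definition snapshot."""
    payload = json.dumps(definition, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def check_unchanged(expected: str, read_definition) -> None:
    """Re-read the definition and abort if it no longer matches."""
    if fingerprint(read_definition()) != expected:
        raise CatalogChanged("table definition changed during processing")


# Toy stand-in for the system catalog (an assumption, not a real catalog layout).
catalog = {"columns": {"id": "bigint", "amount": "numeric"}, "indexes": ["t_pkey"]}
expected = fingerprint(catalog)

check_unchanged(expected, lambda: catalog)  # passes: nothing changed yet

catalog["columns"]["amount"] = "integer"  # e.g. a concurrent ALTER TABLE
try:
    check_unchanged(expected, lambda: catalog)
except CatalogChanged as exc:
    print(f"aborting: {exc}")
```

In this model, a tool that altered the definition itself would have to tell its own changes apart from foreign ones at every check, which hints at where the extra complexity would come from.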

As for the partitioning, it'd perhaps be doable (although not trivial); however, the size of the original (not-yet-partitioned) table and the rate of DML against it might be a problem. I've seen at least one case where pg_squeeze could not finish processing a table before the host ran out of disk space. The point is that pg_squeeze needs to prevent WAL archiving until the phase called "initial load" is done. However, this initial load can take hours, and if enough data changes are performed in the meantime, the amount of accumulated WAL can become too high.

In other words, pg_squeeze might encounter problems when processing huge tables, but tables that users want to make partitioned are supposed to be huge.
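A rough back-of-the-envelope check of the disk-space concern above, as a Python sketch. The numbers (20 MiB/s of WAL, a 6-hour initial load) are purely hypothetical, and the function names are made up for illustration. The only assumption carried over from the discussion is that WAL generated during the initial load has to stay on disk until the load is done.

```python
def retained_wal_gib(wal_rate_mib_per_s: float, load_hours: float) -> float:
    """WAL (in GiB) generated, and therefore kept, while the initial load runs."""
    return wal_rate_mib_per_s * load_hours * 3600 / 1024


def fits_on_disk(wal_rate_mib_per_s: float, load_hours: float, free_gib: float) -> bool:
    """True if the retained WAL stays below the free disk space."""
    return retained_wal_gib(wal_rate_mib_per_s, load_hours) < free_gib


# Hypothetical workload: 20 MiB/s of WAL during a 6-hour initial load.
needed = retained_wal_gib(20, 6)
print(f"{needed:.0f} GiB of WAL retained")  # prints "422 GiB of WAL retained"
print(fits_on_disk(20, 6, free_gib=400))    # prints "False"
```

Both factors grow with table size and write traffic, so the tables most worth partitioning are also the ones most likely to hit this limit.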

Perhaps this blog post can be useful if you want to turn a non-partitioned table into a partitioned one without a noticeable outage: https://blog.hagander.net/repartitioning-with-logical-replication-in-postgresql-13-246/