pgpartman / pg_partman

Partition management extension for PostgreSQL

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

long time to create partition tables

joyful88 opened this issue · comments

Postgresql 15.4 Partman 4.7.0 (SSD)
There is a table with 583 partitions by day. Starting to create the next partitions takes a very long time

2023-12-28 13:09:46 MSK [110811-3] 10.14.242.41(58786) streamer@vkstream CONTEXT: SQL statement "ALTER TABLE public."Post" ATTACH PARTITION public."Post_p2024_01_01" FOR VALUES FROM ('2024-01-0
1 00:00:00+03') TO ('2024-01-02 00:00:00+03')"
PL/pgSQL function create_partition_time(text,timestamp with time zone[],boolean,text) line 249 at EXECUTE
PL/pgSQL function partman.run_maintenance(text,boolean,boolean) line 284 at assignment
2023-12-28 13:09:46 MSK [110811-4] 10.14.242.41(58786) streamer@vkstream STATEMENT: SELECT partman.run_maintenance('public.Post');
2023-12-28 13:10:04 MSK [110811-5] 10.14.242.41(58786) streamer@vkstream LOG: process 110811 acquired AccessExclusiveLock on relation 182225334 of database 16622 after 18788.593 ms
2023-12-28 13:10:04 MSK [110811-6] 10.14.242.41(58786) streamer@vkstream CONTEXT: SQL statement "ALTER TABLE public."Post" ATTACH PARTITION public."Post_p2024_01_01" FOR VALUES FROM ('2024-01-0
1 00:00:00+03') TO ('2024-01-02 00:00:00+03')"
PL/pgSQL function create_partition_time(text,timestamp with time zone[],boolean,text) line 249 at EXECUTE
PL/pgSQL function partman.run_maintenance(text,boolean,boolean) line 284 at assignment
2023-12-28 13:10:04 MSK [110811-7] 10.14.242.41(58786) streamer@vkstream STATEMENT: SELECT partman.run_maintenance('public.Post');

Name |Value |
--------------------------+-----------------------------+
parent_table |public.Post |
control |timestamp |
partition_type |native |
partition_interval |1 day |
constraint_cols |NULL |
premake |14 |
optimize_trigger |4 |
optimize_constraint |30 |
epoch |none |
inherit_fk |true |
retention | |
retention_schema | |
retention_keep_table |true |
retention_keep_index |true |
infinite_time_partitions |true |
datetime_string |YYYY_MM_DD |
automatic_maintenance |on |
jobmon |true |
sub_partition_set_full |false |
undo_in_progress |false |
trigger_exception_handling|false |
upsert | |
trigger_return_null |true |
template_table |partman.vkstream_post_pattern|
publications |NULL |
inherit_privileges |false |
constraint_valid |true |
subscription_refresh | |
drop_cascade_fk |false |
ignore_default_data |false |

waiting:

Name |Value |
----------------+----------------------------------------------+
pid |110811 |
diff |00:28:01.531604 |
state |active |
query |SELECT partman.run_maintenance('public.Post');|
datid |16622 |
datname |vkstream |
pid |110811 |
leader_pid | |
usesysid |16385 |
usename |streamer |
application_name|psql |
client_addr |10.14.242.41 |
client_hostname | |
client_port |58786 |
backend_start |2023-12-28 13:59:59.471 +0400 |
xact_start |2023-12-28 14:00:16.939 +0400 |
query_start |2023-12-28 14:00:16.939 +0400 |
state_change |2023-12-28 14:00:16.939 +0400 |
wait_event_type |IO |
wait_event |DataFileRead |
state |active |
backend_xid |22577429 |
backend_xmin |22577151 |
query_id |1244527056530741779 |
query |SELECT partman.run_maintenance('public.Post');|
backend_type |client backend |

There is another table with 700+ partitions, but there is no such gap on it. The tables are similar, only the column names have been changed

The wait event seems to indicate it is an IO bound wait. Do you have any data in the default partition?

and so in a table of almost 8 TB in size, the default batches were about 480 GB, I cleared it after I started creating new batches, and they were created successfully.
there is also a problem with the call SELECT * FROM partman.partition_gap_fill('public.Post');
It seems like it should create missing partitions, but I waited for more than an hour and the request was not completed.
It seems that either the problem is in the data, or in its massiveness (there are a lot of them)

Just to clarify, there was data in the default?
If so, have you cleared out all of the data in the default table?
Any new table creation is going to be slower than normal until the default table has been emptied if there is a lot of data in it.

the step by day was +4 the command to create new partitions hung on the IO operation, given that the default contained data (430 GB). It was clear from the processes that there was work with the disk, but it was not clear what kind. The operation lasted about 6-8 hours, I then canceled it. Next, I transferred everything from the default table to the temporary table and cleared default. Afterwards, the operation of creating new paritias was completed in a couple of seconds.
I also wanted to create the missing partitions. the default table is cleared, and here is the operation SELECT * FROM partman.partition_gap_fill('public.Post'); behaved the same way, creating new partitions with an uncleaned default. I also worked with the disk but in fact did nothing

Sorry for the delay in response. I'm really not sure what is happening here if there is no data left in the default and normal maintenance seems to be working fine. Do you have an index on the control column?

Any chance you could capture the pg_stat_activity of the gap fill function while it's taking a long time to run? The wait events might help point in the right direction again.

So nice coincidence, this may be part of the problem with why the gap fill is taking so long

#606

Consider something like the following query to check the wait events:

WITH psa_pids(pid) AS (
  SELECT DISTINCT pids
  FROM
    pg_stat_activity psa
  , pg_blocking_pids(psa.pid) pbp
  , unnest(pid || pbp) AS pids
  WHERE cardinality(pbp) > 0
)
SELECT
    psa.pid
  , pg_blocking_pids(psa.pid)
  /* , state, state_change */
  , wait_event_type
  , wait_event
  , backend_type
  , pl.relation::regclass
  , pl.locktype
  , pl.mode
  , pl.granted
  , substr(query, 1, 10) as qry
FROM psa_pids
INNER JOIN pg_stat_activity psa USING (pid)
LEFT JOIN pg_catalog.pg_locks pl on psa.pid = pl.pid
WHERE pl.granted = false
ORDER BY pid, pl.relation::regclass, pl.locktype, pl.mode;