Long running queries and partman.run_maintenance_proc

Question

mattpoel opened this issue 6 months ago

Hi there,

I'm running into what looks like a locking problem or throughput bottleneck when using partman.run_maintenance_proc. Does anyone have suggestions?

We are partitioning a large chunk of data using partman, which has been working great for us. However, we are experiencing a problem in a specific scenario:

pg_cron triggers partman.run_maintenance_proc every hour.
Multiple jobs load data with short transactions.
We have a long-running query that takes more than an hour to complete using READ COMMITTED.
This blocks partman.run_maintenance_proc until the long-running query finishes, which is acceptable because we have pre-made partitions.
Unfortunately, this also slows down data loading jobs with short transactions, resulting in a backlog until either the long-running query is finished or we terminate partman.run_maintenance_proc.
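
One way to see who is waiting on whom while the slowdown is happening is to query pg_stat_activity together with pg_blocking_pids(). This is only a diagnostic sketch: it assumes PostgreSQL 9.6 or later and a role that is allowed to see other sessions' activity (for example a superuser or a member of pg_read_all_stats). It does not by itself explain the slowdown, but it shows whether the INSERT sessions are queued behind the maintenance session.

```sql
-- List every session that is currently waiting on a lock,
-- together with the PIDs of the sessions blocking it.
SELECT a.pid,
       a.state,
       a.wait_event_type,
       a.wait_event,
       now() - a.xact_start    AS xact_age,
       pg_blocking_pids(a.pid) AS blocked_by,
       left(a.query, 80)       AS query_start_text
FROM pg_stat_activity AS a
WHERE cardinality(pg_blocking_pids(a.pid)) > 0
ORDER BY a.xact_start;
```

If the INSERT sessions list the PID of the maintenance session in blocked_by, and the maintenance session in turn lists the long-running query, the backlog is a lock queue rather than a raw throughput limit.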

I'm curious why the INSERTs slow down, sometimes by a tenth of the normal throughput. I'm also looking for ways to avoid this problem other than not using READ COMMITTED or only running maintenance when no long-running query is active, which is not easy to predict.

Thank you in advance!

Keith Fiske · Answer 1 · Mon Feb 12 2024 22:56:49 GMT+0800 (China Standard Time)

Not fully sure why the maintenance job is slowing down the inserts. What version of pg_partman are you running?

Since you are using cron to run maintenance rather than the background worker, you do have one other option. You could write your own custom function to run partition maintenance instead of calling it directly. That custom function could check pg_stat_activity to see whether the known long-running query is running and, if so, skip partition maintenance at that time.
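
A minimal sketch of such a wrapper might look like the following. Several things here are assumptions rather than part of pg_partman: the name run_partman_maintenance_if_idle, the p_min_runtime parameter, and the '%long_report_marker%' pattern used to recognise the long-running query are all hypothetical and would need to be adapted. The wrapper is written as a procedure rather than a function on the assumption that partman.run_maintenance_proc commits between partition sets, which would not be allowed when called from inside a function.

```sql
-- Hypothetical wrapper: skip maintenance while the known long-running query is active.
CREATE OR REPLACE PROCEDURE public.run_partman_maintenance_if_idle(
    p_min_runtime interval DEFAULT interval '10 minutes'  -- assumed threshold
)
LANGUAGE plpgsql
AS $$
DECLARE
    v_blockers integer;
BEGIN
    -- Count active sessions, other than this one, that match the known query
    -- and have been running longer than the threshold.
    SELECT count(*)
      INTO v_blockers
      FROM pg_stat_activity
     WHERE pid <> pg_backend_pid()
       AND state = 'active'
       AND query_start < now() - p_min_runtime
       AND query ILIKE '%long_report_marker%';  -- placeholder pattern, adjust to your query

    IF v_blockers > 0 THEN
        RAISE NOTICE 'Skipping partition maintenance: % matching long-running session(s) active', v_blockers;
        RETURN;
    END IF;

    -- No matching session found, so run the normal maintenance.
    CALL partman.run_maintenance_proc();
END;
$$;

-- Schedule the wrapper hourly with pg_cron instead of the direct call.
SELECT cron.schedule('0 * * * *', 'CALL public.run_partman_maintenance_if_idle()');
```

The check is best-effort: a long-running query that starts just after the check passes can still end up blocking maintenance, so this reduces how often the situation occurs rather than ruling it out. The role running the job also needs to be able to read other sessions' query text in pg_stat_activity.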

Keith Fiske · Answer 2 · Mon Feb 26 2024 22:45:18 GMT+0800 (China Standard Time)

Just checking to see if this answer helped at all before closing this issue.