Hanging pgPool service on production
KarolLipnicki opened this issue · comments
- OS: Red Hat Enterprise Linux release 9.3 (Plow)
- pgPool versions:
- pgpool-II-pg15.x86_64 4.4.5-1pgdg.rhel9 @pgpool44
- pgpool-II-pg15-debuginfo.x86_64 4.4.5-1pgdg.rhel9 @pgpool44
- pgpool-II-pg15-debugsource.x86_64 4.4.5-1pgdg.rhel9 @pgpool44
- pgpool-II-pg15-devel.x86_64 4.4.5-1pgdg.rhel9 @pgpool44
- pgpool-II-pg15-extensions.x86_64 4.4.5-1pgdg.rhel9 @pgpool44
- pgpool-II-pg15-extensions-debuginfo.x86_64 4.4.5-1pgdg.rhel9 @pgpool44
- pgpool-II-release.noarch 4.4-1 @@commandline
On two servers, server #1 and server #2 we databases database1. Sometimes, I don't know why - but in log is then visible:
2024-03-11 09:37:07.020: PostgreSQL JDBC Driver pid 3281883: ERROR: unable to read data from frontend
2024-03-11 09:37:07.020: PostgreSQL JDBC Driver pid 3281883: DETAIL: EOF encountered with frontend
2024-03-11 09:37:07.020: PostgreSQL JDBC Driver pid 3281873: ERROR: unable to read data from frontend
2024-03-11 09:37:07.020: PostgreSQL JDBC Driver pid 3281873: DETAIL: EOF encountered with frontend
2024-03-11 09:37:07.020: PostgreSQL JDBC Driver pid 3281875: ERROR: unable to read data from frontend
2024-03-11 09:37:07.020: PostgreSQL JDBC Driver pid 3281875: DETAIL: EOF encountered with frontend
2024-03-11 09:37:07.021: PostgreSQL JDBC Driver pid 3281886: ERROR: unable to read data from frontend
2024-03-11 09:37:07.021: PostgreSQL JDBC Driver pid 3281886: DETAIL: EOF encountered with frontend
Postgresql in all time is running on both servers in this time but command
psql -h localhost -p 9999 -U pgpool postgres -c "show pool_nodes"
hangs and remote clients can't execute their queries
PgPool config file: pgpool.conf.txt
Postgres db config file:
postgresql.conf.txt
My guess why psql hangs is, your configurations does not satisfy the requirement described in the docs:
https://www.pgpool.net/docs/44/en/html/runtime-config-connection.html#RUNTIME-CONFIG-CONNECTION-SETTINGS
max_pool*num_init_children <= (max_connections - superuser_reserved_connections) (no query canceling needed)
max_pool*num_init_children*2 <= (max_connections - superuser_reserved_connections) (query canceling needed)
In your configurations, max=connections = 1000, superuser_reserved_connections = 3, num_init_children = 300, max_pool = 4.
max_pool*num_init_children = 300 * 4 = 1200
max_connections - superuser_reserved_connections = 1000 - 3 = 997
This does not satisfy the formula even if you do not allow query cancel.
Regarding the error messages, it means that the clients disconnected to pgpool without a terminate message, which is required by the PostgreSQL protocol. The causes could be application bugs, client process gets killed or the network connections between clients and pgpool was broken. You should examine application logs or server logs.
@pengbo0328 @tatsuo-ishii i've changed it to:
PgPool config:
num_init_children = 300
Postgres config:
max_connections = 1000
How should I configure it?
You need to satisfy the formula in the docs. For example,:
num_init_children = 200
max_pool = 4
max_connections = 1000
superuser_reserved_connections = 3
(if no query cancel is required)
I have set like this above and application generates ~400 idle connections. I don't know how to handle this and what to do :(
application generates ~400 idle connections
What do you exactly mean by this? "state" column from pg_stat_activity?
Yes, in pg_stat_activity I see 400 connections in state "Idle"
That's normal. pgpool creates connections to PostgreSQL and keeps them for next use. This is the basic idea of connection pool.
So this connections amount is OK.
That configuration will be OK?
pgpool.conf
num_init_children = 200
max_pool = 4
connection_life_time = 30
client_idle_limit = 30
postgresql.conf
max_connections = 1200
superuser_reserved_connections = 3
Yes, looks good to me.
Thank You for your effort @tatsuo-ishii :)