zalando / patroni

A template for PostgreSQL High Availability with Etcd, Consul, ZooKeeper, or Kubernetes

haproxy/confd integration broken with ZooKeeper

mkedwards opened this issue

What happened?

I'm working on standing up a local containerized Patroni cluster for testing. I tend to prefer ZooKeeper over etcd for global state simply because I know how to deploy it scalably and it's relevant to other discovery needs in our environment. So I fixed a few of the issues around ZooKeeper in the Patroni container (removing util-linux breaks things), and am now here:

demo-haproxy   | 2023-12-08T01:01:13Z haproxy confd[40]: INFO Backend set to zookeeper
demo-haproxy   | 2023-12-08T01:01:13Z haproxy confd[40]: INFO Starting confd
demo-haproxy   | 2023-12-08T01:01:13Z haproxy confd[40]: INFO Backend source(s) set to 'zk1:2181','zk2:2181','zk3:2181'
demo-haproxy   | panic: address 'zk1:2181','zk2:2181','zk3:2181': too many colons in address
demo-haproxy   | 
demo-haproxy   | goroutine 1 [running]:
demo-haproxy   | github.com/kelseyhightower/confd/backends/zookeeper.NewZookeeperClient(0xc4201cced0, 0x1, 0x1, 0x0, 0x0, 0xc4201e4240)
demo-haproxy   | 	/go/src/github.com/kelseyhightower/confd/backends/zookeeper/client.go:20 +0xe0
demo-haproxy   | github.com/kelseyhightower/confd/backends.New(0x0, 0x0, 0x0, 0x0, 0x7ffc7c8a1d46, 0x9, 0x0, 0x0, 0x0, 0x0, ...)
demo-haproxy   | 	/go/src/github.com/kelseyhightower/confd/backends/client.go:57 +0x107e
demo-haproxy   | main.main()
demo-haproxy   | 	/go/src/github.com/kelseyhightower/confd/confd.go:28 +0xb6
demo-haproxy exited with code 2

confd appears to be dead upstream. Maybe there's a better way to automate haproxy config refresh when the set of nodes in the cluster changes?

How can we reproduce it (as minimally and precisely as possible)?

Launch the Patroni container in haproxy mode with something like:

PATRONI_ZOOKEEPER_HOSTS: "'zk1:2181','zk2:2181','zk3:2181'"

The zookeeper package also needs to be installed in the container:

--- a/Dockerfile
+++ b/Dockerfile
@@ -24,7 +24,7 @@ RUN set -ex \
     && apt-cache depends patroni | sed -n -e 's/.*Depends: \(python3-.\+\)$/\1/p' \
             | grep -Ev '^python3-(sphinx|etcd|consul|kazoo|kubernetes)' \
             | xargs apt-get install -y vim curl less jq locales haproxy sudo \
-                            python3-etcd python3-kazoo python3-pip busybox \
+                            python3-etcd python3-kazoo python3-pip zookeeper busybox \
                             net-tools iputils-ping dumb-init --fix-missing \
 \
     # Cleanup all locales but en_US.UTF-8
@@ -68,7 +68,7 @@ RUN set -ex \
     fi \
 \
     # Clean up all useless packages and some files
-    && apt-get purge -y --allow-remove-essential python3-pip gzip bzip2 util-linux e2fsprogs \
+    && apt-get purge -y --allow-remove-essential python3-pip gzip bzip2 e2fsprogs \
                 libmagic1 bsdmainutils login ncurses-bin libmagic-mgc e2fslibs bsdutils \
                 exim4-config gnupg-agent dirmngr \
                 git make \

I had to strip out the single quotes in the zkCli.sh invocation as well:

--- a/docker/entrypoint.sh
+++ b/docker/entrypoint.sh
@@ -20,7 +20,7 @@ case "$1" in
         haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D
         set -- confd "-prefix=$PATRONI_NAMESPACE/$PATRONI_SCOPE" -interval=10 -backend
         if [ -n "$PATRONI_ZOOKEEPER_HOSTS" ]; then
-            while ! /usr/share/zookeeper/bin/zkCli.sh -server "$PATRONI_ZOOKEEPER_HOSTS" ls /; do
+            while ! /usr/share/zookeeper/bin/zkCli.sh -server `echo "$PATRONI_ZOOKEEPER_HOSTS" | tr -d "'"` ls /; do
                 sleep 1
             done
             set -- "$@" zookeeper -node "$PATRONI_ZOOKEEPER_HOSTS"

What did you expect to happen?

A running haproxy container, with the haproxy.cfg populated by confd using the list of cluster members from ZK.

Patroni/PostgreSQL/DCS version

  • Patroni version: master (efdedc7) with above patches
  • PostgreSQL version: PostgreSQL 15.5 (Debian 15.5-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
  • DCS (and its version): zookeeper.version=3.9.1-1398af177833412e9ead6b9bb737dc9fd7418a45, built on 2023-10-04 09:54 UTC

(ZK servers are running the Bitnami zookeeper container; the Patroni container gets ZK client code via current Debian bookworm, which is 3.8.0-11+deb12u1.)

Patroni configuration file

Whatever is in Patroni master; I'm just launching with docker-compose.

patronictl show-config

I don't think I can get to this without a running haproxy?

Patroni log files

The Patroni containers are up and agree on who's the leader. The haproxy container crashes with the same confd panic shown under "What happened?" above.

PostgreSQL log files

Not there yet :)

Have you tried to use GitHub issue search?

  • Yes

Anything else we need to know?

I happen to be running docker / docker-compose on a Mac under Colima/vz/Rosetta so that I can run x86_64 builds of things, but I'm fairly sure that's not relevant.

I got the nested quotes convention from this line in docs/ENVIRONMENT.rst:

-  **PATRONI\_ZOOKEEPER\_HOSTS**: Comma separated list of ZooKeeper cluster members: "'host1:port1','host2:port2','etc...'". It is important to quote every single entity!

zkCli.sh doesn't like that, which is why I tweaked entrypoint.sh. I tried it without the single quotes and confd still crashes with the same error message (minus the quotes):

demo-haproxy   | WATCHER::
demo-haproxy   | 
demo-haproxy   | WatchedEvent state:SyncConnected type:None path:null
demo-haproxy   | [service, zookeeper]
demo-haproxy   | 2023-12-08T00:25:51Z haproxy confd[67]: INFO Backend set to zookeeper
demo-haproxy   | 2023-12-08T00:25:51Z haproxy confd[67]: INFO Starting confd
demo-haproxy   | 2023-12-08T00:25:51Z haproxy confd[67]: INFO Backend source(s) set to zk1:2181,zk2:2181,zk3:2181
demo-haproxy   | panic: address zk1:2181,zk2:2181,zk3:2181: too many colons in address
demo-haproxy   | 
demo-haproxy   | goroutine 1 [running]:
demo-haproxy   | github.com/kelseyhightower/confd/backends/zookeeper.NewZookeeperClient(0xc4201e6c60, 0x1, 0x1, 0x0, 0x0, 0xc4201304c0)
demo-haproxy   | 	/go/src/github.com/kelseyhightower/confd/backends/zookeeper/client.go:20 +0xe0
demo-haproxy   | github.com/kelseyhightower/confd/backends.New(0x0, 0x0, 0x0, 0x0, 0x7fff40688d52, 0x9, 0x0, 0x0, 0x0, 0x0, ...)
demo-haproxy   | 	/go/src/github.com/kelseyhightower/confd/backends/client.go:57 +0x107e
demo-haproxy   | main.main()
demo-haproxy   | 	/go/src/github.com/kelseyhightower/confd/confd.go:28 +0xb6
demo-haproxy exited with code 2

The ZooKeeper "integration" (as far as docker-compose is concerned) never evolved beyond supporting a single ZooKeeper node.
That is probably why you are hitting all sorts of issues with zkCli.sh and confd. confd expects the list of nodes not as a single -node argument, but as a set of -node value pairs, one per node.
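
To illustrate the point about -node value pairs, here is a minimal sketch of how an entrypoint script could split the documented PATRONI_ZOOKEEPER_HOSTS format into one -node flag per host. The helper name zk_hosts_lines and the variable hosts are hypothetical and are not part of Patroni's entrypoint.sh; this is not necessarily how the linked pull request solves it. The script only prints the resulting confd command line rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not from Patroni's entrypoint.sh): print each host
# of a comma-separated list on its own line, with the single quotes of the
# documented "'host1:port1','host2:port2'" format stripped.
zk_hosts_lines() {
    printf '%s\n' "$1" | tr -d "'" | tr ',' '\n'
}

# Example value in the format described in docs/ENVIRONMENT.rst.
hosts="'zk1:2181','zk2:2181','zk3:2181'"

# Start the confd argument list, then append one "-node host:port" pair per
# host instead of passing the whole list as a single -node value.
set -- confd -backend zookeeper
for host in $(zk_hosts_lines "$hosts"); do
    set -- "$@" -node "$host"
done

# Show the command line that would be executed.
echo "$@"
# prints: confd -backend zookeeper -node zk1:2181 -node zk2:2181 -node zk3:2181
```

The unquoted command substitution in the for loop relies on word splitting, which is safe here only under the assumption that host:port entries contain no whitespace or glob characters.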

Thank you for the note on how confd expects to receive multiple ZK bootstrap nodes. I'm now up and running with this:

#2985