zalando / patroni

A template for PostgreSQL High Availability with Etcd, Consul, ZooKeeper, or Kubernetes

Consul service checks failing when TLS is enabled on Patroni Rest API

ponvenkates opened this issue

What happened?

I have configured TLS on the Patroni REST API and it works fine. But when I enable register_service for Consul, the service check fails continuously (see the attached screenshot).

In the Patroni logs, I see a bad certificate error.

Exception in thread Thread-13 (process_request_thread):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.10/site-packages/patroni/api.py", line 1631, in process_request_thread
    request.do_handshake()
  File "/usr/lib/python3.10/ssl.py", line 1371, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: SSLV3_ALERT_BAD_CERTIFICATE] sslv3 alert bad certificate (_ssl.c:1007)

Once I set register_service to false, the error disappeared.

How can we reproduce it (as minimally and precisely as possible)?

  1. Enable TLS on the REST API by configuring the certificates.
  2. Set register_service to true in the consul configuration.
  3. Verify the service health in Consul.
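
Step 3 can also be checked from a script instead of the Consul UI. The following is a minimal sketch, not part of the original report. It assumes a Consul agent reachable at http://127.0.0.1:8500 and a hypothetical service name `my-cluster` (Patroni registers the service under the cluster scope). It uses only Consul's `/v1/health/checks/<service>` endpoint, which returns a JSON list of checks, each with a `Status` field.

```python
import json
import urllib.request

CONSUL_ADDR = "http://127.0.0.1:8500"  # assumption: local Consul agent


def failing_checks(checks):
    """Return (name, status, output) for every check that is not passing."""
    return [
        (c.get("Name", ""), c.get("Status", ""), c.get("Output", ""))
        for c in checks
        if c.get("Status") != "passing"
    ]


def fetch_checks(service, consul_addr=CONSUL_ADDR):
    """Fetch all health checks Consul holds for the given service name."""
    url = f"{consul_addr}/v1/health/checks/{service}"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

Calling `failing_checks(fetch_checks("my-cluster"))` should return an empty list when the checks pass. For a failing check, the `Output` field usually carries the error message that the Consul agent saw.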

What did you expect to happen?

  1. Service checks should pass
  2. Service should turn healthy.

Patroni/PostgreSQL/DCS version

  • Patroni version: 3.1.2
  • PostgreSQL version: 13.12.0
  • DCS (and its version): consul (15.2)

Patroni configuration file

bootstrap:
  initdb:
  - encoding: UTF8
  - data-checksums

  post_init: /post_init.sh

  pg_hba:
  - host all all 0.0.0.0/0 md5
  - local all all    md5

consul:
  service_tags: ['node-2']

postgresql:
  _comment: postgres settings
  

log:
  _comment: Log configuration
  level: INFO
  traceback_level: ERROR

patronictl show-config

check_timeline: true
failsafe_mode: true
log:
  level: INFO
loop_wait: 10
master_start_timeout: 60
maximum_lag_on_failover: 1048576
pg_hba:
- local all all    md5
- host all all 0.0.0.0/0 md5
postgresql:
  parameters:
    max_connections: 150
    max_replication_slots: 5
    max_wal_senders: 20
    max_worker_processes: 1
    unix_socket_directories: /tmp
    wal_keep_segments: 512
    wal_level: logical
  pg_hba:
  - local all all    md5
  - host all all 0.0.0.0/0 md5
  remove_data_directory_on_diverged_timelines: true
  remove_data_directory_on_rewind_failure: true
  use_pg_rewind: true
  use_slots: true
retry_timeout: 10
ttl: 30

Patroni log files

Exception in thread Thread-13 (process_request_thread):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.10/site-packages/patroni/api.py", line 1631, in process_request_thread
    request.do_handshake()
  File "/usr/lib/python3.10/ssl.py", line 1371, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: SSLV3_ALERT_BAD_CERTIFICATE] sslv3 alert bad certificate (_ssl.c:1007)

PostgreSQL log files

Irrelevant

Have you tried to use GitHub issue search?

  • Yes

Anything else we need to know?

No response

Consul can't use client certificates for health check requests.
That's not a Patroni problem.

@CyberDem0n But as per the documentation, if PATRONI_RESTAPI_VERIFY_CLIENT is set to 'none', Patroni shouldn't fail any requests with a "bad certificate" error. Isn't that right?

@ponvenkates there is nothing in your original message explaining how you configured restapi.

Sorry about that. I have configured the REST API using environment variables.

PATRONI_RESTAPI_CONNECT_ADDRESS="Node IP:8008"
PATRONI_RESTAPI_LISTEN="0.0.0.0:8008"
PATRONI_RESTAPI_USERNAME="admin"
PATRONI_RESTAPI_PASSWORD="admin password"
PATRONI_RESTAPI_CERTFILE="server cert file location"
PATRONI_RESTAPI_KEYFILE="server key file location"
PATRONI_RESTAPI_CAFILE="root cert file location"

PATRONI_RESTAPI_VERIFY_CLIENT is not set, which leaves it at the default 'none' as per the documentation.
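
To make the effect of this setting concrete, here is a rough sketch of how the three documented values ('none', 'optional', 'required') could translate into a Python TLS server context. The mapping onto the `ssl` module's verification modes and the helper name `server_context` are assumptions for illustration. This is not a copy of Patroni's code.

```python
import ssl

# Assumption: the documented verify_client values correspond to
# Python's standard ssl verification modes.
VERIFY_MODES = {
    "none": ssl.CERT_NONE,          # never request a client certificate
    "optional": ssl.CERT_OPTIONAL,  # request one, accept its absence
    "required": ssl.CERT_REQUIRED,  # reject clients without a valid one
}


def server_context(verify_client="none", certfile=None, keyfile=None, cafile=None):
    """Build a server-side TLS context for the given verify_client value."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)
    if cafile:
        # CA bundle used to validate client certificates, if any are sent
        ctx.load_verify_locations(cafile=cafile)
    ctx.verify_mode = VERIFY_MODES[verify_client]
    return ctx
```

With `ssl.CERT_NONE` the server does not ask the client for a certificate at all. A handshake failure in that mode is therefore more likely to come from how the client judges the server certificate than from a missing client certificate.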

This error could indicate that something is wrong with the certificates (which is again not a Patroni problem).
Please try to access the Patroni REST API with curl -k and see if it works.
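
The same test can be scripted. The sketch below is a minimal Python counterpart to `curl -k`, with an extra verified probe for comparison. The helper names are made up for illustration, and the URL and CA path in the usage note are placeholders to be replaced with the real node address and root certificate.

```python
import ssl
import urllib.error
import urllib.request


def client_context(verify=True, cafile=None):
    """Return a client TLS context; verify=False mirrors `curl -k`."""
    ctx = ssl.create_default_context(cafile=cafile)
    if not verify:
        ctx.check_hostname = False  # must be disabled before verify_mode
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def probe(url, ctx, timeout=5):
    """Return the HTTP status code, or the exception if the request failed."""
    try:
        with urllib.request.urlopen(url, context=ctx, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # TLS worked; the server answered with an error status
    except (urllib.error.URLError, ssl.SSLError, OSError) as exc:
        return exc
```

For example, compare `probe("https://NODE_IP:8008/health", client_context(verify=False))` with `probe("https://NODE_IP:8008/health", client_context(cafile="/path/to/root.crt"))`. If the unverified probe succeeds and the verified one fails, the server certificate (its chain, or the names and IP addresses it covers) is the first thing to inspect.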