Clean up expired SSL certificates in order to prevent web timeouts

Question

yeah opened this issue 9 months ago

We have 53 domains and have been running MIAB since 2017. By now, our /home/user-data/ssl has grown to over 1,000 old, expired certificate files.

Since get_ssl_certificates() from management/ssl_certificates.py loops over all of these old certs, reading and analyzing them at least once per domain for most web requests to the management interface, the web interface had become unusable due to request timeouts (like in #1966). Another result was that DNSSEC RRSIG updates stopped working properly, so many DNS servers no longer resolved our domains correctly. (This is because /etc/cron.daily/mailinabox-dnssec uses ~/tools/dns_update, which in turn makes web requests.)

The simple make-it-work-now solution is to delete the old files from /home/user-data/ssl.

But there should be a mechanism in MIAB that removes obsolete certificates or at least disregards them when looping in get_ssl_certificates().

jvolkenant · Answer 1 · Wed Oct 18 2023 06:23:01 GMT+0800 (China Standard Time)

To a degree, that's probably already done, but the time it takes to go through all the certs likely still exceeds the proxy timeout. As I mentioned in #1966, extending the proxy timeout might fix this. Or, as you say, deleting old certs. If that isn't added to ssl_certificates.py, something like find /home/user-data/ssl/*-*.pem -maxdepth 1 -mtime +365 -delete may work.
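For anyone who would rather preview what gets removed first, here is a minimal Python sketch of the same idea as the find command above. It is not part of MIAB: the function names find_stale_pem_files and remove_stale_pem_files are made up for illustration, and it assumes file modification time is a good enough stand-in for certificate expiry (it never parses the certificates). It defaults to a dry run.

```python
import time
from pathlib import Path


def find_stale_pem_files(ssl_dir, max_age_days=365, now=None):
    """List top-level '*-*.pem' files in ssl_dir last modified more than max_age_days ago.

    Hypothetical helper: uses mtime only, not the certificate's real expiry date.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    # glob() without '**' does not descend into subdirectories.
    for path in sorted(Path(ssl_dir).glob("*-*.pem")):
        # Leave symlinks and directories alone; only plain files are candidates.
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_mtime < cutoff:
            stale.append(path)
    return stale


def remove_stale_pem_files(ssl_dir, max_age_days=365, dry_run=True):
    """Return the stale files; delete them only when dry_run is False."""
    stale = find_stale_pem_files(ssl_dir, max_age_days)
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale


if __name__ == "__main__":
    # Dry run: print the candidates without deleting anything.
    for candidate in remove_stale_pem_files("/home/user-data/ssl"):
        print(candidate)
```

Run it once as a dry run, check the list, and only then call remove_stale_pem_files(..., dry_run=False). Taking a backup of the directory beforehand is probably wise.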

Runar Ingebrigtsen · Answer 2 · Thu Oct 26 2023 18:10:01 GMT+0800 (China Standard Time)

I deleted old certs and that made the status panel appear again.

jvolkenant · Answer 3 · Thu Oct 26 2023 22:50:51 GMT+0800 (China Standard Time)

It must take longer than 60s to pull the data from all the certs. The default is proxy_read_timeout 60s; https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_read_timeout

You can set the timeout to 5 minutes as a test. If that works any better, it would confirm that the issue is related to the nginx proxy timeout setting.

In /etc/nginx/conf.d/local.conf, add proxy_read_timeout 300s; to the location /admin/ section of the PRIMARY_DOMAIN's server block, then restart nginx:

...
server {
    server_name <PRIMARY_DOMAIN>;
    ...
    location /admin/ {
        ...
        proxy_read_timeout 300s;
    }
    ...
}

yeah · Answer 4 · Thu Oct 26 2023 23:11:47 GMT+0800 (China Standard Time)

I've done this and can confirm that increasing the nginx timeout resolves the issue. It's not a real fix though. Two years down the line, or with a couple more domains/certs, the timeout will eventually have to be increased to 10 minutes, etc.

I've debugged management/ssl_certificates.py and have pointed to where the core issue is. It's an easy fix.

IMHO, increasing the nginx timeout is just a bandaid, or more water in a leaking bucket so to speak :-)

yeah · Answer 5 · Mon Jul 22 2024 16:05:53 GMT+0800 (China Standard Time)

Reminder to ideally revert de5a060 / #2407 as this issue is closed.