mail-in-a-box / mailinabox

Mail-in-a-Box helps individuals take back control of their email by defining a one-click, easy-to-deploy SMTP+everything else server: a mail server in a box.

Home Page: https://mailinabox.email/

Clean up expired SSL certificates in order to prevent web timeouts

yeah opened this issue · comments

commented

We have 53 domains and have been running MIAB since 2017. By now, our /home/user-data/ssl had grown to over 1,000 old, expired certificate files.

Since get_ssl_certificates() in management/ssl_certificates.py loops over all of these old certs, reading and analyzing them at least once per domain for most web requests to the management interface, the web interface had become unusable due to request timeouts (as in #1966). As a further consequence, DNSSEC RRSIG updates stopped working properly, so many DNS servers no longer resolved our domains correctly. (This is because /etc/cron.daily/mailinabox-dnssec uses ~/tools/dns_update, which in turn makes web requests.)

The simple make-it-work-now solution is to remove the old files from /home/user-data/ssl.

But there should be a mechanism in MIAB that removes obsolete certificates or at least disregards them when looping in get_ssl_certificates().

To a degree, that's probably already done, but the time it takes to go through all the certs likely still exceeds the proxy timeout. As I mentioned in #1966, extending the proxy timeout might fix this. Or, as you say, deleting old certs. If that isn't added to ssl_certificates.py, something like find /home/user-data/ssl/*-*.pem -maxdepth 1 -mtime +365 -delete may work.
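For anyone who would rather review the candidates before deleting anything, here is a rough Python equivalent of that find command. It is a sketch under assumptions: find_old_pem_files() and delete_files() are hypothetical helpers, the age check only approximates find's -mtime +365 rounding, and symlinks are skipped as a precaution.

```python
import fnmatch
import os
import time


def find_old_pem_files(ssl_dir, max_age_days=365, now=None):
    """List regular files named like *-*.pem last modified over max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    old = []
    for entry in os.scandir(ssl_dir):  # top level only, like -maxdepth 1
        if not entry.is_file(follow_symlinks=False):
            continue  # skip directories and symlinks
        if not fnmatch.fnmatch(entry.name, "*-*.pem"):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            old.append(entry.path)
    return sorted(old)


def delete_files(paths, dry_run=True):
    """Print what would be removed; delete only when dry_run is False."""
    for path in paths:
        if dry_run:
            print("would delete", path)
        else:
            os.remove(path)
```

Running delete_files(find_old_pem_files("/home/user-data/ssl")) only prints the candidates; pass dry_run=False once the list looks right. Note that modification time is not the same as certificate expiry, so this is a heuristic, just like the find one-liner.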

I deleted old certs and that made the status panel appear again.

It must take longer than 60s to pull the data from all the certs. The default is proxy_read_timeout 60s; https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_read_timeout

You can set the timeout to 5 minutes to test whether that works any better. If it does, that would confirm the issue is related to the nginx proxy timeout setting.

In /etc/nginx/conf.d/local.conf, add proxy_read_timeout 300s; to the location /admin/ section of the PRIMARY_DOMAIN's server block, then restart nginx:

...
server {
    server_name <PRIMARY_DOMAIN>;
    ...
    location /admin/ {
        ...
        proxy_read_timeout 300s;
    }
    ...
}
commented

I've done this and can confirm that increasing the nginx timeout resolves the issue. It's not a real fix, though. Two years down the line, or with a couple more domains/certs, the timeout will eventually have to be increased to 10 minutes, and so on.

I've debugged management/ssl_certificates.py and have pointed out where the core issue is. It's an easy fix.

IMHO, increasing the nginx timeout is just a band-aid, or more water in a leaking bucket, so to speak :-)

commented

Reminder to ideally revert de5a060 / #2407 as this issue is closed.