Traefik redundancy and DNS configuration

Question

NReilingh opened this issue 2 years ago

Hi @tiangolo -- this is a great guide and I'm enthusiastic about Docker Swarm as a better fit for places where Kubernetes is overkill. One thing that the guide doesn't go into tremendous detail on is DNS, and I have been confused about the specifics of redundancy, considering DNS does not actually provide for redundancy on its own as far as I can tell.

My understanding is this: At the end of the day, your DNS needs to point to the swarm nodes that Traefik is deployed on. If you have multiple Traefik nodes, you can round robin them in DNS to distribute load, but if one node fails, nothing is stopping DNS from continuing to resolve to that node, in proportion to the other IPs that are configured in the round-robin scheme. Thus, the redundancy of service distribution across the swarm doesn't translate to service availability for a client.
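To make the proportion concrete, here is a minimal sketch of plain round-robin resolution with one dead node. It assumes clients connect to whichever address they are handed and do not retry another record; the IP addresses and the `simulate_round_robin` helper are hypothetical, chosen only for illustration.

```python
from itertools import cycle


def simulate_round_robin(ips, down, requests):
    """Return the fraction of requests that are handed a dead address.

    Models naive round-robin DNS: addresses are handed out in strict
    rotation, and the rotation knows nothing about node health.
    """
    rotation = cycle(ips)
    failures = sum(1 for _ in range(requests) if next(rotation) in down)
    return failures / requests


# Hypothetical Traefik node addresses (documentation range 203.0.113.0/24).
ips = ["203.0.113.10", "203.0.113.11", "203.0.113.12"]

# One of three nodes is down, yet it keeps receiving its share of lookups.
failure_rate = simulate_round_robin(ips, down={"203.0.113.11"}, requests=3000)
print(f"share of requests sent to a dead node: {failure_rate:.2%}")
```

With one of three nodes down, roughly a third of lookups still point at the dead address, no matter how healthy the rest of the swarm is.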

One thing that could help here is that it isn't strictly necessary to run Traefik on a manager node -- Traefik can access the Docker API from another host over TCP or SSH if made available.

In light of this, one possible way to increase reliability would be to factor out Traefik to not run in the Docker Swarm at all, and instead have a pair of separate hosts running Traefik in round robin, proxying traffic back to the swarm. This still doesn't protect you from one of those Traefik hosts failing, but since those hosts have only one job we would consider them to be extremely stable. So in effect, the strategy is to compensate for DNS's lack of redundancy with stability, by trading off on the flexibility and automation of running Traefik inside the swarm.

Curious to know your thoughts on this, or if you think I'm missing anything that I should consider. And thanks again for writing up the guide! It's a great resource.

Luis Al · Answer 1 · Tue Oct 25 2022 00:52:56 GMT+0800 (China Standard Time)

This might not fully address your concerns, but if you use separate nodes for Traefik ingress, consider setting up Keepalived to provide a virtual IP that targets your swarm (wherever you expect Traefik to run).

Your DNS record should point to this IP. Keepalived's health checks handle high availability, and Traefik handles load balancing. Hopefully this makes sense and doesn't further complicate your scenario.
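As a rough illustration of the idea, the sketch below models which node would hold the virtual IP. It is a simplified model, not Keepalived's actual implementation: it assumes the healthy node with the highest priority owns the address, and the `Node` class, the `vip_holder` function, the node names, and the priorities are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    priority: int  # higher value wins the election in this model
    healthy: bool  # result of the node's health check


def vip_holder(nodes):
    """Return the name of the node that holds the virtual IP, or None.

    Simplified model: only healthy nodes are candidates, and the
    candidate with the highest priority holds the address.
    """
    candidates = [n for n in nodes if n.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n.priority).name


# Two hypothetical Traefik hosts sharing one virtual IP.
nodes = [Node("traefik-a", 150, True), Node("traefik-b", 100, True)]

before = vip_holder(nodes)  # the preferred node holds the address
nodes[0].healthy = False    # its health check starts failing
after = vip_holder(nodes)   # the address moves to the backup node
print(before, "->", after)
```

Because DNS only ever points at the virtual IP, clients keep using the same address while the node behind it changes.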

Sebastián Ramírez · Answer 2 · Sun Dec 10 2023 21:46:32 GMT+0800 (China Standard Time)

Hello! Thanks for the post! I should let you know that I had to deprecate this website and its ideas. I would no longer recommend Docker Swarm Mode for new projects: https://dockerswarm.rocks/swarm-or-kubernetes/ 🥲

github-actions · Answer 3 · Thu Dec 21 2023 08:21:23 GMT+0800 (China Standard Time)

Assuming the original issue was solved, it will be automatically closed now. But feel free to add more comments or create new issues.