cagov / ui-claim-tracker

This repo contains the Claim Status Tracker app, which helps Californians better understand what’s happening with their unemployment claim and benefits.

Eliminate SNAT port exhaustion under load

kalvinwang opened this issue

Description

We’ve addressed SNAT port exhaustion in multiple ways (implementing keep-alive on API gateway and App Insights calls; disabling Live Monitor in all envs other than prod), and CDT addressed it by moving the Claim Tracker app onto its own App Service plan, away from BNSCN / last year’s app. We could fine-tune some of this, but I think the low-hanging fruit is gone. Under high load, the number of SNAT ports needed is still far higher than what’s available, which results in pending and failed SNAT connections.

This Virtual Network NAT solution is the "Best" Microsoft-recommended solution, and CDT offered to look into implementing it, so we should take them up on it. (I'm not certain that link is the right one, because I don't know whether Azure Front Door uses an Azure public load balancer for load balancing, but it should be something similar.)

Virtual Network NAT simplifies outbound-only Internet connectivity for virtual networks. When configured on a subnet, all outbound connectivity uses your specified static public IP addresses. Outbound connectivity is possible without load balancer or public IP addresses directly attached to virtual machines. NAT is fully managed and highly resilient.

Using a NAT gateway is the best method for outbound connectivity. A NAT gateway is highly extensible, reliable, and doesn't have the same concerns of SNAT port exhaustion.

Acceptance Criteria

  • Confirm SNAT port exhaustion is eliminated under load

We have an existing example of a singleton/static instance in our pino logger setup. I'll confirm that the behavior persists across requests and investigate doing the same for our HTTP agent instantiation.
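For illustration, a minimal sketch of what a singleton HTTP agent could look like. This is not the app's actual code: `getHttpsAgent`, `sharedAgent`, and the `maxSockets` value are hypothetical and chosen only for the example. The idea is that one keep-alive agent is created per Node process and reused by every outbound call, so requests share pooled sockets instead of each opening a new connection (and consuming a new SNAT port).

```typescript
import * as https from "https";

// Hypothetical module-level cache. It lives for the lifetime of the Node
// process, so every request handled by that process sees the same agent.
let sharedAgent: https.Agent | undefined;

// Hypothetical helper: lazily creates the agent once, then returns the
// same instance on every later call.
function getHttpsAgent(): https.Agent {
  if (sharedAgent === undefined) {
    sharedAgent = new https.Agent({
      keepAlive: true, // keep sockets open for reuse between requests
      maxSockets: 50, // illustrative cap on concurrent sockets per host
    });
  }
  return sharedAgent;
}
```

Callers would pass the result as the `agent` option on each outbound request, for example `https.request({ host, path, agent: getHttpsAgent() })`. Whether the module-level state really survives between requests depends on how the hosting environment loads modules, which is the behavior to confirm.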

We are no longer seeing SNAT Port exhaustion in production:
[Screenshot: SNAT Port Exhaustion report in Production showing no issues]