MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure

Home Page: https://docs.microsoft.com/azure

agentkeepalive

tony-gutierrez opened this issue

The max sockets setting for agentkeepalive is PER HOST, so the example doesn't make much sense. Also, there is no explanation of where the recommended number of 160 sockets per VM comes from.
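To illustrate the per-host point: if the socket limit is enforced per host, the worst-case number of concurrent outbound sockets is the per-host limit multiplied by the number of distinct hosts. The sketch below is only an illustration of that arithmetic. The helper names `perHostMaxSockets` and `worstCaseSockets` are hypothetical, and the 160 figure is simply the number quoted in this thread.

```javascript
// Hypothetical helper: split a total outbound-socket budget across hosts,
// because the agent's max sockets setting is enforced per host, not globally.
function perHostMaxSockets(totalBudget, hostCount) {
  if (!Number.isInteger(totalBudget) || totalBudget < 1) {
    throw new RangeError('totalBudget must be a positive integer');
  }
  if (!Number.isInteger(hostCount) || hostCount < 1) {
    throw new RangeError('hostCount must be a positive integer');
  }
  // Always allow at least one socket per host.
  return Math.max(1, Math.floor(totalBudget / hostCount));
}

// Hypothetical helper: worst-case concurrent sockets for a per-host setting.
function worstCaseSockets(maxSocketsPerHost, hostCount) {
  return maxSocketsPerHost * hostCount;
}

// With a per-host setting of 160 and 4 hosts, up to 640 sockets could be open.
console.log(worstCaseSockets(160, 4));
// Keeping 4 hosts within a total budget of 160 means 40 sockets per host.
console.log(perHostMaxSockets(160, 4));
```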


@tony-gutierrez Thanks for the feedback! We are currently investigating and will update you shortly.

@tony-gutierrez Each individual VM hosting the application is limited to 160 SNAT sockets, which is where this number comes from. This is an intentional platform design, and I am not aware of any upcoming changes to this quota. With regard to the sample provided, I have asked the doc author to investigate and provide an update as necessary.

@rramachand21 Can you please review the feedback about agentkeepalive and update the doc as necessary? Thank you.

@Tysonn I am having issues assigning the doc author to this issue. The doc lists ranjithr, but the people site lists rramachand21 as the GitHub profile. Neither profile appears under the assignees menu. Can you please advise on how to proceed?

The 160 is only a pre-allocation, not a limit.
https://www.theregister.co.uk/2018/02/27/microsoft_rewrites_source_network_address_translation/
https://docs.microsoft.com/en-us/azure/load-balancer/load-balancer-outbound-connections

I am experiencing port exhaustion in almost all my Azure Node instances. Using keep-alive (native or the recommended library) helps, but I have the situation of many connections to a few hosts, as described here: https://docs.microsoft.com/en-us/azure/load-balancer/load-balancer-outbound-connections#pat

I am having a really hard time finding the number of sockets that gives me decent performance and avoids timeouts. This has never been an issue for me on any AWS VMs, even without keep-alive. I had never experienced "port exhaustion" until trying to deploy Node on Azure.
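When tuning the socket count, it can help to see how many sockets the process actually holds per host. A minimal sketch, assuming a native Node `http.Agent` (or `https.Agent`): the agent exposes `sockets`, `freeSockets`, and `requests` tables keyed by host, and the hypothetical helper `socketUsage` below just counts their entries. It reports only what the Node process holds, not the SNAT ports allocated by the platform.

```javascript
const http = require('http');

// Hypothetical helper: count the entries in each of the agent's per-host tables.
function socketUsage(agent) {
  const countPerHost = (table) => {
    const counts = {};
    for (const [name, list] of Object.entries(table)) {
      counts[name] = list.length;
    }
    return counts;
  };
  return {
    active: countPerHost(agent.sockets),   // sockets currently serving requests
    idle: countPerHost(agent.freeSockets), // kept-alive sockets waiting for reuse
    queued: countPerHost(agent.requests),  // requests waiting for a free socket
  };
}

// Example: log usage for a shared agent (all tables are empty until requests are made).
const sharedAgent = new http.Agent({ keepAlive: true });
console.log(socketUsage(sharedAgent));
```

A steadily growing `queued` count for one host would suggest the per-host limit is too low for that destination, while large `active` counts across many hosts point toward the outbound port budget.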

It would also be helpful if there were a way to inspect a given VM to see whether it is using the old 160 or the new 1024 preallocation.

Hi Tony, just FYI, it's never 1,024 because our pool is larger. And we are reverting to the old 160 preallocation value. So the document is still relevant and up to date.

You might want to stop recommending that module and just recommend using native Node keep-alive. The options are almost identical, and that module might have a race condition. Native will probably always be faster as well. I replaced the module with native keep-alive with good results, other than the whole 160 issue.
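For reference, a minimal sketch of what native keep-alive could look like with the built-in `https.Agent`. The numeric values are illustrative assumptions, not recommendations from this thread, and `getStatus` is a hypothetical helper. `maxTotalSockets` is a cap across all hosts that exists only in newer Node.js releases, so treat it as optional.

```javascript
const https = require('https');

// All numbers below are illustrative assumptions, not recommendations.
const keepAliveAgent = new https.Agent({
  keepAlive: true,      // reuse idle sockets instead of opening one per request
  maxSockets: 40,       // enforced per host, not per process
  maxFreeSockets: 10,   // idle sockets kept open per host
  maxTotalSockets: 160, // cap across all hosts; newer Node.js releases only
});

// Hypothetical helper: GET a URL through the shared agent and
// resolve with the HTTP status code.
function getStatus(url) {
  return new Promise((resolve, reject) => {
    https
      .get(url, { agent: keepAliveAgent }, (res) => {
        res.resume(); // drain the body so the socket can return to the pool
        res.on('end', () => resolve(res.statusCode));
      })
      .on('error', reject);
  });
}
```

Sharing one agent across all requests in the process is what lets sockets be reused; creating a new agent per request would defeat the pooling.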

@tony-gutierrez We will now proceed to close this thread. If there are further questions regarding this matter, please reopen it and tag me in your reply. We will gladly continue the discussion.