awsdocs / amazon-ecs-developer-guide

The open source version of the Amazon ECS developer guide. You can submit feedback & requests for changes by submitting issues in this repo or by making proposed changes & submitting a pull request.

Explain how ecs-agent and ecs-telemetry endpoints pertain to exfiltration concerns

copumpkin opened this issue · comments

First off, thank you for supporting PrivateLink! It makes ECS far more attractive to security-sensitive organizations like mine.

But one thing that struck me is that PrivateLink policies are supported on the ecs endpoint, but not on the ecs-agent and ecs-telemetry endpoints. The documentation just says I need all three, but doesn't really explain when and why (e.g., do I need all three for Fargate or just for traditional ECS?)

Unfortunately if I need all three endpoints in my VPC, then it makes me very uneasy to not have policy support on the latter two. I realize they're (probably deliberately) undocumented, but many security-conscious organizations are concerned about exfiltration risks from private VPCs that are not connected to the internet. Policy support on the ecs endpoint gives us the power to control exfiltration on that one, but the other two are sort of "wide open" to the world and make me very uneasy. It's possible that they're somehow restricted internally to only allow telemetry within my account but since they're undocumented it's hard to know.

tl;dr: could you explain a bit of how the ecs-agent and ecs-telemetry endpoints work from an exfiltration standpoint? PrivateLink's primary driver (as I understand it) is exfiltration-conscious offline VPCs so it seems relevant to the ECS PrivateLink documentation

Also, I ask specifically whether ecs-agent and ecs-telemetry are necessary for Fargate because I've been able to launch Fargate tasks in a private VPC without those endpoints, so I'm wondering if:

  1. They're now somehow necessary even for Fargate which didn't need them before
  2. Fargate doesn't strictly need them, but benefits from having them (perhaps by posting more information to CloudWatch or whatever when they're available?)
  3. Fargate doesn't need them at all and doesn't even care if they're there

While I'm asking about that page of documentation, let me add another related question:

To create the VPC endpoint for the Amazon ECS service, use the Creating an Interface Endpoint procedure in the Amazon VPC User Guide to create the following endpoints, in this order:

The "in this order" part confuses me. The endpoints appear to be independent of one another. Do I need to go into my CloudFormation and Terraform templates and introduce artificial dependencies between the endpoints to force the correct sequencing? Why is the order important?

We appreciate your patience while we worked through your questions over the holiday. I've put responses to each of your questions below; let us know if we can assist further.

Regarding PrivateLink Endpoint Policies:

We do not currently support endpoint policies on any of the three endpoints. A misconfiguration on our side, which we have since corrected, allowed an endpoint policy to be set on one of the endpoints. We apologize for the confusion. We are currently determining whether we can support endpoint policies in the future.

Regarding the need for one, some, or all endpoint configuration:

For Amazon EC2 instances running the ECS agent within a VPC, you need all three endpoints in order to connect fully through PrivateLink. The "ecs" endpoint is our frontend API endpoint. It hosts the normal APIs used by the CLI, the SDKs, and similar tools, and it is also what the ECS agent itself uses to signal state changes in tasks.

The "ecs-agent" and "ecs-telemetry" endpoints are the websocket endpoints that the ECS agent requires for its orchestration activities. Outside of PrivateLink, the agent would normally connect to external endpoints for this purpose.

Regarding Fargate:

Except for pulling images from Amazon ECR and other image repositories, Fargate tasks deal with our API endpoints differently; the managed activities do not happen within your VPC, and would be largely unaffected by your PrivateLink endpoints within your VPC.

Regarding endpoint ordering:

The order in which we say to create the endpoints is recommended because of how the ECS agent currently behaves. If you have no EC2 instances running in the VPC before the endpoints are created, the order does not matter. If you are starting fresh, the main dependency you want is for your Auto Scaling groups or EC2 instance launches to happen after the endpoints are created.

When the instance starts, it discovers its websocket endpoints (corresponding to "agent" and "telemetry" above). To discover that it should communicate over the PrivateLink endpoints, it needs to register and connect over the PrivateLink ECS API endpoint (in other words, through the "ecs" endpoint). With instances already running, creating the endpoints in a different order leads to some inconsistency until the agents are restarted or until the ECS agent rediscovers and reconnects to its endpoints (which it does periodically).
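
For anyone scripting this rather than using the console, the ordering described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not an official procedure: the helper names `ecs_service_names` and `create_ecs_endpoints` are hypothetical, the order (websocket endpoints first, the "ecs" API endpoint last) is inferred from the explanation above, and the `ec2_client` argument is assumed to behave like a boto3 EC2 client, whose `create_vpc_endpoint` call is used here.

```python
from typing import Any, List, Sequence

# Assumption: create the two websocket endpoints first and the "ecs" API
# endpoint last, so an agent that registers through "ecs" finds the others.
ECS_ENDPOINT_SUFFIXES = ("ecs-agent", "ecs-telemetry", "ecs")


def ecs_service_names(region: str) -> List[str]:
    """Return the three ECS interface endpoint service names, in creation order."""
    return [f"com.amazonaws.{region}.{suffix}" for suffix in ECS_ENDPOINT_SUFFIXES]


def create_ecs_endpoints(
    ec2_client: Any,
    region: str,
    vpc_id: str,
    subnet_ids: Sequence[str],
    security_group_ids: Sequence[str],
) -> List[str]:
    """Create the three interface endpoints one after another.

    `ec2_client` is assumed to be a boto3 EC2 client (or a stand-in with the
    same `create_vpc_endpoint` method). Returns the new endpoint IDs in the
    order they were created.
    """
    endpoint_ids: List[str] = []
    for service_name in ecs_service_names(region):
        response = ec2_client.create_vpc_endpoint(
            VpcEndpointType="Interface",
            VpcId=vpc_id,
            ServiceName=service_name,
            SubnetIds=list(subnet_ids),
            SecurityGroupIds=list(security_group_ids),
            PrivateDnsEnabled=True,
        )
        endpoint_ids.append(response["VpcEndpoint"]["VpcEndpointId"])
    return endpoint_ids
```

When starting fresh, the point to carry over is to call something like `create_ecs_endpoints(...)` before launching any Auto Scaling groups or EC2 instances in the VPC. In CloudFormation or Terraform, the equivalent would be making the instance resources depend on the endpoints rather than chaining the endpoints to each other.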

@joelbrandenburg thanks for the detailed response, and the answers all make sense! I did notice that the policies on ECS had gone away and look forward to them coming back, since I (and many other customers in a similar boat) can't actually use the service meaningfully until policies are on (all) the endpoints. Or just the ECS endpoint if we only want Fargate 😄

Your answers also seem valuable to incorporate into the official documentation on the topic, since I doubt I'll be the only person wondering about this stuff.

Yes, we are in the process of updating the documentation with this information as well. Thanks for continuing to use Amazon ECS!

How are all the ECS PrivateLink endpoints useful without ECR? Basically from a private subnet, we can now talk to ECS ... but not ECR?!

I'm using EC2 instances whose ECS agents connect to an ECS cluster. Apparently, I need to create all three of those endpoints. Isn't that too much for such a simple task? I think they will cost me much more than a single endpoint would.

Do I need to set up these three ECS endpoints when using AWS Batch with Fargate? They are needed when I use AWS Batch with EC2. When setting up the ECS endpoints, the resource needs to be "*", which is not allowed in our firm. Can I skip setting up these endpoints if I switch to AWS Batch on Fargate?