fluxcd / flux

Successor: https://github.com/fluxcd/flux2

Home Page: https://fluxcd.io

Support for AWS ECR?

hartmut-pq opened this issue

Hi,

Is there support for AWS ECR? I couldn't really find documentation or figure out how to get it working...
ECR requires generating an authorization token, which changes every 12h.

Because the Docker CLI does not support the standard AWS authentication methods, you must authenticate your Docker client another way so that Amazon ECR knows who is requesting to push or pull an image. If you are using the Docker CLI, then use the docker login command to authenticate to an Amazon ECR registry with an authorization token that is provided by Amazon ECR and is valid for 12 hours.

http://docs.aws.amazon.com/AmazonECR/latest/userguide/Registries.html#registry_auth
http://docs.aws.amazon.com/AmazonECR/latest/userguide/docker-push-ecr-image.html

is there support for AWS ECR?

Not at the moment. We would love to support ECR, but haven't yet.

ECR requires generating an authorization token, which changes every 12h.

That's probably the biggest issue that we would need to work on, but in the meantime you can work around it by periodically regenerating the config and then running fluxctl set-config -f flux-ecr-generated.config.

To generate config, I would start with:

eval "$(aws ecr get-login)"
printf "registry: {auths: %s}" "$(jq -c .auths ~/.docker/config.json)"

Hope the above makes sense. I'd be happy to help if you jump on Slack, where we can figure out how to update the docs and talk more about how we could make this easier for you.

Hi,

I gave it a go but ran into the following issues (AWS ID & repo names are fake...):

$ fluxctl list-images
Error: 500 Internal Server Error getting images for services: fetching image metadata for 123456789.dkr.ecr.eu-west-1.amazonaws.com/xyz: Get https://123456789.dkr.ecr.eu-west-1.amazonaws.com/v2/xyz/tags/list: http: non-successful response (status=401 body="Not Authorized\n")

It's definitely a valid login; the xyz repo name was fetched from aws ecr, so that part seems to work.
I noticed a different error with -n 10 added:

$ fluxctl list-images -n 10
Error: 500 Internal Server Error getting images for services: fetching image metadata for 123456789.dkr.ecr.eu-west-1.amazonaws.com/xyz: Get https://index.docker.io/v2/123456789.dkr.ecr.eu-west-1.amazonaws.com/xyz/tags/list: http: non-successful response (status=401 body="{\"errors\":[{\"code\":\"UNAUTHORIZED\",\"message\":\"authentication required\",\"detail\":[{\"Type\":\"repository\",\"Class\":\"\",\"Name\":\"123456789.dkr.ecr.eu-west-1.amazonaws.com/xyz\",\"Action\":\"pull\"}]}]}\n")

It seems to put together the wrong URL. Is it accidentally trying to access docker.io under the hood?
-> https://index.docker.io/v2/123456789.dkr.ecr.eu-west-1.amazonaws.com/xyz/tags/list
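
For fellow debuggers: the second error is what one would expect if the image name was not split into a registry host and a repository path. Below is a minimal sketch of the usual Docker naming convention, written in Python for brevity. It is not Flux's actual code (Flux is written in Go), the function names are hypothetical, and it assumes the name carries no tag or digest and ignores Docker Hub's implicit library/ prefix.

```python
from typing import Tuple

DEFAULT_REGISTRY = "index.docker.io"


def split_image_name(name: str) -> Tuple[str, str]:
    """Split an image name into (registry host, repository path)."""
    first, sep, rest = name.partition("/")
    # The first component is a registry host only if it looks like one:
    # it contains a dot or a port separator, or it is "localhost".
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    # Otherwise the whole name is a repository on the default registry.
    return DEFAULT_REGISTRY, name


def tags_list_url(name: str) -> str:
    """Build the registry v2 URL that lists the tags of an image."""
    host, repo = split_image_name(name)
    return "https://%s/v2/%s/tags/list" % (host, repo)


print(tags_list_url("123456789.dkr.ecr.eu-west-1.amazonaws.com/xyz"))
print(tags_list_url("weaveworks/flux"))
```

With this split, the ECR name resolves to the ECR host seen in the first error rather than to index.docker.io.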

@hartmut-pq could you please try the following:

  1. refresh the token by re-running the commands I showed earlier
  2. edit the config and make sure only *.amazonaws.com keys are present in the auths object
  3. check the actual keys (based on what you've shown above, the correct key should be 123456789.dkr.ecr.eu-west-1.amazonaws.com)
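
Steps 2 and 3 can be sketched roughly as follows. This is an illustration in Python rather than a tested recipe. The helper names are hypothetical, the sample auth value is fake, and stripping a leading scheme such as https:// from the keys is an assumption based on step 3 asking for a bare host name.

```python
import json


def ecr_only_auths(docker_config: dict) -> dict:
    """Keep only the auth entries whose registry host ends in .amazonaws.com."""
    kept = {}
    for key, value in docker_config.get("auths", {}).items():
        # Keys may be bare hosts or URLs; reduce either form to the host.
        host = key.split("://", 1)[-1].split("/", 1)[0]
        if host.endswith(".amazonaws.com"):
            kept[host] = value
    return kept


def render_registry_config(docker_config: dict) -> str:
    """Render the same one-line config that the printf/jq command produces."""
    auths = json.dumps(ecr_only_auths(docker_config), separators=(",", ":"))
    return "registry: {auths: %s}" % auths


# Fake sample standing in for the contents of ~/.docker/config.json.
sample = {
    "auths": {
        "https://123456789.dkr.ecr.eu-west-1.amazonaws.com": {"auth": "ZmFrZQ=="},
        "https://index.docker.io/v1/": {"auth": "b3RoZXI="},
    }
}

print(render_registry_config(sample))
```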

As I said earlier, you are welcome to log in to Weaveworks Slack and ping me (@ilya).

NB this workaround will no longer work, since the credentials are needed by the daemon rather than the service. We'll have to build something into the flux daemon if we can -- as a stop-gap, it may be possible to add a sidecar container that does the credentials update (though we'd probably have to build in a flag to override where to look for credentials).

Does k8s (specifically kubelet) work with ECR? How?

Answering my own question: yes, it does.

Just to clarify, ECR is still not supported in flux, right?

The error, for fellow debuggers, is:

auth={map[]} err="requesting tags: Get https://.dkr.ecr.us-east-1.amazonaws.com/v2//tags/list: no basic auth credentials"

Correct. Hence this issue is still open.

For those affected by this limitation, check out this tool that lets you use imagePullSecrets for AWS credentials. I was able to work around my issue by using it.

@corcoran How did you get this to work? I still get the "no basic auth credentials" error even though I'm using registry-creds and injecting the imagePullSecrets into the flux service account.

Flux doesn't try to access image secrets attached to its service account (maybe it should; but I'm not sure they are accessible to the processes inside the pod).

However, #1065 could help. If you have a sidecar that can put the image creds into a volume, you can use --docker-config to tell flux to look there for credentials. @errordeveloper Want to try doing a worked example?

Here's a proof of concept I came up with for anybody who wants a workaround until this is natively supported. I'm just trialing a new deployment pipeline for my team's clusters so I want to iterate fast rather than fix the Go code right now.

https://github.com/mwhittington21/flux-ecr-example

Follow the instructions there and hopefully you can get it working. Thanks for all the tips above, helped me out a lot.

@mwhittington21 Nice! Looks like a good approach, thank you for posting here 💯

@mwhittington21 Your sidecar example works for me except on the refresh. A new cred file is loaded correctly into the flux container, but it doesn't seem like flux picks up the new credentials in the file?

Yeah, unfortunately there is no way to get Flux to refresh the creds. For now I've moved to a bash script, run as the entrypoint for the pod, that kills Flux once every ~8 hours. It's not very pretty.

I would like to make a PR to add a Flux endpoint that will allow refresh of creds from file which would allow the sidecar to signal Flux to reload the config. Either that, or Flux could re-read the file after an auth error.

@tornadical you can see an example of this approach on the kill-every-8-hours-tmp-fix branch.

@mwhittington21 If flux just read the file every time it needed it, would that fix this problem?

@squaremo It looks that way from my perspective. The new cred file is loaded correctly each refresh into the flux container.

#1230 makes it so that the credentials are read each time they're requested, rather than just at the start. If anyone is in a position to try this out from a local build, I'd appreciate feedback.
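
The idea of reading the credentials on every request, rather than once at startup, can be illustrated with a small sketch. It is in Python for brevity and is not the code from #1230. The class and helper names are hypothetical, and the file layout is assumed to be the usual Docker config.json with an auths object.

```python
import json
import os
import tempfile


class FileCredentials:
    """Looks up registry credentials by re-reading the file on every call."""

    def __init__(self, path):
        self.path = path

    def auth_for(self, host):
        # No caching: a sidecar may have rewritten the file since the last call.
        with open(self.path) as handle:
            config = json.load(handle)
        return config.get("auths", {}).get(host)


def write_config(path, host, auth):
    """Stand-in for the sidecar: replace the file atomically with new creds."""
    tmp = path + ".tmp"
    with open(tmp, "w") as handle:
        json.dump({"auths": {host: {"auth": auth}}}, handle)
    os.replace(tmp, path)


HOST = "123456789.dkr.ecr.eu-west-1.amazonaws.com"
with tempfile.TemporaryDirectory() as directory:
    config_path = os.path.join(directory, "config.json")
    creds = FileCredentials(config_path)
    write_config(config_path, HOST, "old-token")
    first = creds.auth_for(HOST)
    write_config(config_path, HOST, "new-token")
    second = creds.auth_for(HOST)

print(first, second)
```

Because nothing is cached, the second lookup sees the refreshed file without a restart, which is the behaviour the sidecar approach needs.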

@squaremo Any guidance for someone who might want to submit a PR to build ECR support into Flux?

What kind of guidance are you after?

I have no particular design in my head for this. But I can pre-judge some approaches that might come up:

  • if it would involve baking an AWS CLI binary into the image, it's better done as a separate image that can be run as a sidecar (but perhaps this already exists? see the thread above)
  • if it's as simple as the GCP hack, then it could reasonably live alongside it

Hey guys,

As I was also facing this issue, I came up with a simplified solution that works well for me. I took the part from #539 (comment) and adjusted it slightly by mounting a generated JSON file directly from the outside and passing it as an argument to fluxd. So my question is: would it be possible to add the docker.config parameter to the helm chart? Then you only need to generate the ConfigMap or Secret.

Cheers,
Thomas

Hi guys, I created ecr-k8s-secret-creator which I have deployed as a kubernetes pod. It automatically creates or updates my Flux Pod's mounted kubernetes Secret. This Secret contains my ECR docker config.json.

The Secret's format is:

{
 "metadata": {
  "name": "${SECRET_NAME}"
 },
 "data": {
  "config.json": "${BASE64_ENCODED_DOCKER_AUTH_CONFIG}"
 }
}
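
For readers wondering what goes into the base64 value, here is a rough sketch of one way to build it, written in Python. It is not taken from ecr-k8s-secret-creator; the function names, the secret name and the credentials are placeholders. It assumes the usual Docker config.json layout, where each auth entry is the base64 encoding of "username:password".

```python
import base64
import json


def docker_config_json(registry_host, username, password):
    """Build a Docker config.json document holding one registry login."""
    auth = base64.b64encode(("%s:%s" % (username, password)).encode()).decode()
    return json.dumps({"auths": {registry_host: {"auth": auth}}})


def secret_manifest(name, registry_host, username, password):
    """Wrap the config in the Secret shape shown above (values are base64)."""
    config = docker_config_json(registry_host, username, password)
    encoded = base64.b64encode(config.encode()).decode()
    return {"metadata": {"name": name}, "data": {"config.json": encoded}}


# Placeholder values only; a real password would come from ECR.
manifest = secret_manifest(
    "flux-ecr", "123456789.dkr.ecr.eu-west-1.amazonaws.com", "AWS", "fake-password"
)
print(json.dumps(manifest, indent=1))
```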

Thanks to #1065 for making this possible. 🎉

if it would involve baking an AWS CLI binary into the image, it's better done as a separate image that can be run as a sidecar (but perhaps this already exists? see the thread above)

No, I don't think a binary needs to be used.

if it's as simple as the GCP hack, then it could reasonably live alongside it

I think it's similar, but using the AWS SDK may be more convenient than net/http+encoding/json (which is what the GCP cousin uses). I cannot think of a good reason not to use the SDK.

We would need to do something roughly like what @bzon's code does:

https://github.com/bzon/ecr-k8s-secret-creator/blob/master/main.go#L52-L69

Am I understanding correctly that GetGCPOauthToken() gets called every time we encounter an image with a gcr.io prefix and want to fetch it, so there is no token refresh loop per se? I'd imagine we can do a similar thing for ECR, although a token refresh loop would be a reasonable optimisation to consider.

Personally, I think it's far, far better to have a similar kind of setup for all registries. If I'm not mistaken, Azure's registry would require a similar kind of thing. I'd like to note that making registries accessible outside of their native cloud is also a legitimate use case, but hopefully we can assume it's not too important just yet.

Great intel, thanks @errordeveloper. Yeah, we could start with just getting a token when we need it, and later try to refresh it on a schedule.
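
The "get a token when we need it" idea, with a cheap cache so that a 12-hour token is not requested for every image, might look roughly like the sketch below. It is Python and purely illustrative: CachedToken and fake_fetch are hypothetical names, the fetch callable stands in for whatever call actually obtains the ECR token, and the 11-hour lifetime is an assumed safety margin under the 12 hours mentioned earlier in the thread.

```python
import time


class CachedToken:
    """Fetches a token on demand and reuses it until it is considered stale."""

    def __init__(self, fetch, ttl_seconds, now=time.monotonic):
        self._fetch = fetch      # callable returning a fresh token
        self._ttl = ttl_seconds  # how long a token is trusted for
        self._now = now          # injectable clock, handy for testing
        self._token = None
        self._expires_at = 0.0

    def get(self):
        # Fetch only when there is no token yet or the cached one has expired.
        if self._token is None or self._now() >= self._expires_at:
            self._token = self._fetch()
            self._expires_at = self._now() + self._ttl
        return self._token


# Demonstration with a fake clock and a fake fetch function.
clock = [0.0]
calls = []


def fake_fetch():
    calls.append(clock[0])
    return "token-%d" % len(calls)


cache = CachedToken(fake_fetch, ttl_seconds=11 * 3600, now=lambda: clock[0])
a = cache.get()         # first use: fetches
b = cache.get()         # still fresh: reuses the cached token
clock[0] = 11 * 3600    # pretend eleven hours have passed
c = cache.get()         # expired: fetches again
print(a, b, c, len(calls))
```

A scheduled refresh loop could later replace the expiry check without changing the callers of get().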

@jml ^ Ilya has answered your question better than I could :-)

@errordeveloper Thank you very much! I'm using your project and Flux is now able to detect new images pushed to ECR and trigger the automatic upgrade, good work!

So is ECR supported or not?

@michalschott I think that ECR is not natively supported by Flux yet. But, if you deploy the third-party project https://github.com/bzon/ecr-k8s-secret-creator, it works like a charm. Thanks to this project, we are using Flux with ECR without any issues.

I followed the solution suggested by @bzon: deploying the ecr-k8s-secret-creator pod, editing the Weave Flux deployment via kubectl edit deployment weave-flux-agent, mounting the created secret as a volume in the Flux pod, and adding the --docker-config argument to the container. However, I notice that the pod spec for the agent keeps getting updated automatically, which results in the removal of the --docker-config flag. The other edits I made, such as mounting the secret created by ecr-k8s-secret-creator as a volume, appear to stick. What is causing this and what is the solution?