giantswarm / aws-operator

Manages Kubernetes clusters running on AWS (before Cluster API)

Home Page: https://www.giantswarm.io/

S3 bucket names include the customer ID, which can contain invalid characters

rossf7 opened this issue · comments

The bucket names we generate include the customer ID. The customer ID can include underscores, which are invalid in S3 bucket names. We should validate the customer ID and/or change the bucket name format to prevent this error.
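
For reference, the S3 rules disallow underscores: a bucket name must be 3-63 characters of lowercase letters, digits, dots, and hyphens, and must start and end with a letter or digit. A minimal Go sketch of the kind of validation being proposed (the function name is illustrative, not taken from the codebase):

```go
package main

import (
	"fmt"
	"regexp"
)

// bucketNameRE encodes the S3 naming rules: 3-63 characters,
// lowercase letters, digits, dots, and hyphens only, starting and
// ending with a letter or digit. Underscores are NOT allowed.
var bucketNameRE = regexp.MustCompile(`^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$`)

// isValidBucketName reports whether name satisfies those rules.
func isValidBucketName(name string) bool {
	return bucketNameRE.MatchString(name)
}

func main() {
	fmt.Println(isValidBucketName("acme-cluster-eu-west-1")) // true
	fmt.Println(isValidBucketName("acme_corp-cluster"))      // false: underscore
}
```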

https://github.com/giantswarm/aws-operator/blob/master/service/create/bucket_names.go#L9
http://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestrictions.html

The question is why the customerID (or even organizationID) is needed at all for S3 bucket names. If there is a need for a unique ID, it should be the clusterID, as the S3 bucket pertains to the cluster, doesn't it?

cc @marians

Please also be aware that the cluster's owner organization is changeable. See PATCH /v4/clusters/{cluster_id}/.

Given that, encoding the owner org name (ID) into the bucket name is risky.

The S3 bucket is used for storing the encrypted cloudconfigs. Currently, the bucket name includes the customer ID and a dir is created per cluster ID.

We could use the clusterID in the bucket name. We'd need to make sure the bucket is deleted with the cluster. FYI the AWS default limit is 100 buckets but we could increase that.

@nhlfr @asymmetric WDYT?

+1 to increasing the bucket limit and using cluster-id for the bucket name.

As long as the resulting bucket name is reasonably unique (unique across all of S3), I think using the clusterID sounds good.

Do we then need a subdirectory? I think not, since the clusterID is specific enough. And yes, we need to start deleting buckets when we delete the cluster.

So to summarize, the name would be:

AWSID-CLUSTERID-REGION.
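
Sketching that proposed format as a small helper (the function name and example values are illustrative, not the operator's actual code):

```go
package main

import "fmt"

// bucketName assembles the proposed S3 bucket name from the AWS
// account ID, the cluster ID, and the region. In the operator the
// real inputs would come from its configuration, not literals.
func bucketName(awsAccountID, clusterID, region string) string {
	return fmt.Sprintf("%s-%s-%s", awsAccountID, clusterID, region)
}

func main() {
	fmt.Println(bucketName("123456789012", "x7k2a", "eu-central-1"))
	// → 123456789012-x7k2a-eu-central-1
}
```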

I think the new format for the bucket name is good. I don't think we need a subdirectory.

I'll take this because it makes #322 easier. Without this, we would need to calculate the hash of the cloudconfig when deleting, which is ugly.

After this, we just delete the whole bucket, so we don't need to care about the individual BucketObjects.
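
One caveat worth noting: S3 refuses to delete a non-empty bucket (it returns BucketNotEmpty), so "delete the whole bucket" still means deleting every object first and then the bucket. A toy in-memory sketch of that ordering (fakeS3 is purely illustrative, not the AWS SDK):

```go
package main

import (
	"errors"
	"fmt"
)

// fakeS3 is an in-memory stand-in for S3, used only to show the
// required deletion order; the real operator would call the AWS API.
type fakeS3 struct {
	buckets map[string]map[string][]byte // bucket -> object key -> body
}

// deleteBucket mirrors S3's behavior: it fails while objects remain.
func (s *fakeS3) deleteBucket(name string) error {
	if len(s.buckets[name]) > 0 {
		return errors.New("BucketNotEmpty")
	}
	delete(s.buckets, name)
	return nil
}

// emptyAndDelete deletes every object first, then the bucket itself.
func (s *fakeS3) emptyAndDelete(name string) error {
	for key := range s.buckets[name] {
		delete(s.buckets[name], key)
	}
	return s.deleteBucket(name)
}

func main() {
	s := &fakeS3{buckets: map[string]map[string][]byte{
		"123456789012-x7k2a-eu-central-1": {"cloudconfig/master": []byte("...")},
	}}
	// Deleting directly fails; emptying first succeeds.
	fmt.Println(s.deleteBucket("123456789012-x7k2a-eu-central-1"))
	fmt.Println(s.emptyAndDelete("123456789012-x7k2a-eu-central-1"))
}
```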