giantswarm / aws-operator

Manages Kubernetes clusters running on AWS (before Cluster API)

Home Page: https://www.giantswarm.io/

S3 bucket names include the customer ID, which can contain invalid characters

rossf7 opened this issue · comments

The bucket names we generate include the customer ID. The customer ID can include underscores, which are invalid in S3 bucket names. We should validate the customer ID and/or change the bucket name format to prevent this error.
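
For reference, the S3 rules disallow underscores: a bucket name must be 3-63 characters of lowercase letters, digits, dots, and hyphens, and must start and end with a letter or digit. A minimal Go sketch of the kind of validation being proposed (the function name is illustrative, not taken from the codebase):

```go
package main

import (
	"fmt"
	"regexp"
)

// bucketNameRE encodes the S3 naming rules: 3-63 characters,
// lowercase letters, digits, dots, and hyphens only, starting and
// ending with a letter or digit. Underscores are NOT allowed.
var bucketNameRE = regexp.MustCompile(`^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$`)

// isValidBucketName reports whether name satisfies those rules.
func isValidBucketName(name string) bool {
	return bucketNameRE.MatchString(name)
}

func main() {
	fmt.Println(isValidBucketName("acme-cluster-eu-west-1")) // true
	fmt.Println(isValidBucketName("acme_corp-cluster"))      // false: underscore
}
```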

https://github.com/giantswarm/aws-operator/blob/master/service/create/bucket_names.go#L9
http://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestrictions.html

The question is why the customerID (or even organizationID) is needed at all for S3 bucket names. If there is a need for a unique ID, it should be the clusterID, as the S3 bucket pertains to the cluster, doesn't it?

cc @marians

Please also be aware that the cluster's owner organization is changeable. See PATCH /v4/clusters/{cluster_id}/.

Given that, encoding the owner org name (ID) into the bucket name is risky.

The S3 bucket is used for storing the encrypted cloudconfigs. Currently, the bucket name includes the customer ID and a dir is created per cluster ID.

We could use the clusterID in the bucket name. We'd need to make sure the bucket is deleted with the cluster. FYI the AWS default limit is 100 buckets but we could increase that.

@nhlfr @asymmetric WDYT?

+1 to increasing the bucket limit and using cluster-id for the bucket name.

As long as the resulting bucket name is reasonably unique (unique across all of S3), I think using the clusterID sounds good.

Do we then need a subdirectory? I think not, since the clusterID is specific enough. And yes, we need to start deleting buckets when we delete the cluster.

So to summarize, the name would be:

AWSID-CLUSTERID-REGION.
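
Sketching that proposed format as a small helper (the function name and example values are illustrative, not the operator's actual code):

```go
package main

import "fmt"

// bucketName assembles the proposed S3 bucket name from the AWS
// account ID, the cluster ID, and the region. In the operator the
// real inputs would come from its configuration, not literals.
func bucketName(awsAccountID, clusterID, region string) string {
	return fmt.Sprintf("%s-%s-%s", awsAccountID, clusterID, region)
}

func main() {
	fmt.Println(bucketName("123456789012", "x7k2a", "eu-central-1"))
	// → 123456789012-x7k2a-eu-central-1
}
```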

I think the new format for the bucket name is good. I don't think we need a subdirectory.

I'll take this because it makes #322 easier. Without this, we would need to calculate the hash of the cloudconfig when deleting, which is ugly.

After this, we just delete the whole bucket, so we don't need to care about the individual BucketObjects.
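
One caveat worth noting: S3 refuses to delete a non-empty bucket (it returns BucketNotEmpty), so "delete the whole bucket" still means deleting every object first and then the bucket. A toy in-memory sketch of that ordering (fakeS3 is purely illustrative, not the AWS SDK):

```go
package main

import (
	"errors"
	"fmt"
)

// fakeS3 is an in-memory stand-in for S3, used only to show the
// required deletion order; the real operator would call the AWS API.
type fakeS3 struct {
	buckets map[string]map[string][]byte // bucket -> object key -> body
}

// deleteBucket mirrors S3's behavior: it fails while objects remain.
func (s *fakeS3) deleteBucket(name string) error {
	if len(s.buckets[name]) > 0 {
		return errors.New("BucketNotEmpty")
	}
	delete(s.buckets, name)
	return nil
}

// emptyAndDelete deletes every object first, then the bucket itself.
func (s *fakeS3) emptyAndDelete(name string) error {
	for key := range s.buckets[name] {
		delete(s.buckets[name], key)
	}
	return s.deleteBucket(name)
}

func main() {
	s := &fakeS3{buckets: map[string]map[string][]byte{
		"123456789012-x7k2a-eu-central-1": {"cloudconfig/master": []byte("...")},
	}}
	// Deleting directly fails; emptying first succeeds.
	fmt.Println(s.deleteBucket("123456789012-x7k2a-eu-central-1"))
	fmt.Println(s.emptyAndDelete("123456789012-x7k2a-eu-central-1"))
}
```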