brandonweiss / discharge

⚡️ A simple, easy way to deploy static websites to Amazon S3.

Configurable ACL

clarencesong opened this issue

Currently this is set to ACL: "public-read". Would it be possible to make this configurable so that it can be omitted for deployments that don't need public-read to be set? (For example, websites that should only be accessible over HTTPS via CloudFront but not directly from S3.)
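Roughly what I have in mind, purely as an illustrative sketch with the AWS SDK for JavaScript (not discharge's actual code; the function and option names are made up):

```javascript
const AWS = require("aws-sdk")

const s3 = new AWS.S3()

// Hypothetical upload helper: only attach an ACL when one is requested.
// Omitting the ACL leaves access control to the bucket policy instead
// (e.g. a CloudFront Origin Access Identity grant).
const uploadFile = ({ bucket, key, body, contentType, acl }) => {
  const params = {
    Bucket: bucket,
    Key: key,
    Body: body,
    ContentType: contentType,
  }

  if (acl) {
    params.ACL = acl // e.g. "public-read" for plain S3 website hosting
  }

  return s3.putObject(params).promise()
}
```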

Hmm. 🤔

It’s been a while since I pored over the AWS docs to figure out how to wire all this together, but my vague memory is that setting it to public-read was necessary in order for CloudFront to work? I could totally be wrong though! Can you point me to the docs on this that say it doesn’t need to be set to public-read?

Assuming I'm wrong, then we should probably dynamically decide what ACL to set. I have an aversion to having any more configuration options than are absolutely necessary. So, for example, without a CDN we'd use public-read, and with a CDN we'd use private (or whatever the correct value is).

I can imagine this being a lot more complicated to implement than it appears, though, mainly because locking down the S3 bucket is something that can really only be done after you've properly configured everything. For example, if you deploy your site without a CDN, then come back weeks later and decide you want to turn it on, we could set the bucket to private after enabling the CDN, but that would actually break your site, because you wouldn't have updated your DNS to point to CloudFront instead of S3 yet. I think there are a few scenarios where attempting to lock down the bucket for you might result in a situation you don't expect, which makes it seem like it does need to be a configuration option.
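(For clarity, by "dynamically decide" I mean something like this, just a sketch where the configuration key is illustrative rather than an actual discharge option:)

```javascript
// Pick the ACL based on whether a CDN is configured, rather than adding
// a new configuration option. "cdn" is a stand-in for however the
// deployment configuration indicates CloudFront is in use.
const aclForDeployment = (configuration) =>
  configuration.cdn ? "private" : "public-read"
```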

If it has to be configured manually then I would want it to be something that is really necessary. Do you think it is? That is, when using CloudFront, is setting the bucket to private just more technically correct, or does it actually serve a real, necessary purpose? Is it a security concern? I can actually imagine how you might want it to be public even with a CDN. I've occasionally run into issues where the CDN didn't appear to bust the cache and being able to check S3 to make sure it was updated was helpful.

Let me know your thoughts!

I tried looking for documentation that explicitly confirms or denies that public-read is necessary for CloudFront, but haven't been able to find any. The closest I've come across suggesting it isn't needed is this page, which recommends using an Origin Access Identity.

My CloudFront distribution is set up in this manner, with "Restrict Bucket Access" turned on, and the distribution's Origin Access Identity being used by S3 to grant access to the bucket.
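For reference, the kind of bucket policy that setup ends up with looks roughly like this; sketched here with the AWS SDK for JavaScript, and the OAI ID and bucket name are placeholders:

```javascript
const AWS = require("aws-sdk")

const s3 = new AWS.S3()

// Only the distribution's Origin Access Identity may read objects;
// everything else falls back to being denied by default.
const policy = {
  Version: "2012-10-17",
  Statement: [
    {
      Sid: "AllowCloudFrontOAIReadOnly",
      Effect: "Allow",
      Principal: {
        AWS: "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity EXXXXXXXXXXXXX",
      },
      Action: "s3:GetObject",
      Resource: "arn:aws:s3:::example-bucket/*",
    },
  ],
}

s3.putBucketPolicy({
  Bucket: "example-bucket",
  Policy: JSON.stringify(policy),
}).promise()
```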

I've validated that this setup works by uploading a new/uncached .html file to S3 manually:

  1. Ensure that "Read object" for "Group: Everyone" is not "Yes" under the object's "Permissions" tab > "Public access" (a programmatic version of this check is sketched after these steps)
  2. Access the file directly from the S3 bucket URL, which returns "403 Forbidden" with a "Code: AccessDenied" error
  3. Access the file from the CloudFront URL, which returns the file content successfully
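For what it's worth, the check in step 1 can also be done programmatically; a rough sketch with the AWS SDK for JavaScript, where the bucket and key names are placeholders:

```javascript
const AWS = require("aws-sdk")

const s3 = new AWS.S3()

// Returns true if the object's ACL grants READ to the "Everyone"
// (AllUsers) group, i.e. the equivalent of public-read.
const isObjectPubliclyReadable = async (bucket, key) => {
  const acl = await s3.getObjectAcl({ Bucket: bucket, Key: key }).promise()

  return acl.Grants.some(
    (grant) =>
      grant.Grantee.Type === "Group" &&
      grant.Grantee.URI === "http://acs.amazonaws.com/groups/global/AllUsers" &&
      (grant.Permission === "READ" || grant.Permission === "FULL_CONTROL")
  )
}
```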

I believe the permission in (1) is set to "Yes" when Discharge uploads files with the ACL: "public-read" option, causing the file to be accessible from the S3 bucket URL directly, while not affecting CloudFront access.

It seems that an object's ACL is effectively given higher priority: a public-read object ACL grants access even when the bucket policy doesn't.

You're right that there are many scenarios where locking down the S3 bucket isn't ideal, so setting it dynamically would cause problems. Also, some deployments may not mind users being able to reach the site via an S3 URL directly, but others may want to prevent that because their content is more sensitive and S3 website URLs are guessable.

In terms of granularity, using object ACLs to lock down a bucket may be a little too granular, and would cause further issues downstream when new objects are uploaded to S3 with different ACLs, making the entire deployment difficult to manage. According to AWS's guidelines, object ACLs make sense when "permissions vary by object and you need to manage permissions at the object level". (Unfortunately, the section about "bucket ACL" says that a bucket ACL is only recommended for logging purposes. Befuddling!)

If controlling access to an entire bucket should be done at the bucket level, then supporting what CloudFront automates during setup with the "Restrict Bucket Access" option (like in the tutorial linked above) would be a good way to go. We'd also need to stop setting ACL: "public-read" on the individual objects, so it doesn't override the policies set up during that process.

Other ways I've come across would be to use a WAF to restrict by IPs or to have S3 check an HTTP header sent by CloudFront, both of which would require extra work to set up.
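To illustrate the header-check idea: one common way to wire it up is to have CloudFront send a secret custom Referer header to the origin and have the bucket policy require it. A rough sketch (bucket name and secret are placeholders):

```javascript
const AWS = require("aws-sdk")

const s3 = new AWS.S3()

// Reads are only allowed when the request carries the secret Referer
// value that CloudFront is configured to add as an origin custom header.
const policy = {
  Version: "2012-10-17",
  Statement: [
    {
      Sid: "AllowReadsOnlyViaCloudFrontHeader",
      Effect: "Allow",
      Principal: "*",
      Action: "s3:GetObject",
      Resource: "arn:aws:s3:::example-bucket/*",
      Condition: {
        StringEquals: { "aws:Referer": "some-long-random-secret" },
      },
    },
  ],
}

s3.putBucketPolicy({
  Bucket: "example-bucket",
  Policy: JSON.stringify(policy),
}).promise()
```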

Hmm, OK, so it sounds like I was wrong about the ACL needing to be public-read in order to use CloudFront.

I think there are two things I don't quite understand, though.

What is the scenario where you want to restrict direct read access to the bucket? Is it more for “cleanliness” or is there a concrete security concern?

What would be your suggestion about how to modify discharge to allow locking down the bucket when using a CDN?

For hosting an internal website that should only be seen by people who are authorized to view it. There doesn't seem to be an easy way to limit access to an S3-hosted website, but with CloudFront there are many ways to do it (the WAF and header-check approaches above, for example).

Unfortunately, I've only used Discharge to deploy to S3, while CloudFront is manually configured due to some custom domain requirements, so I haven't had enough experience with the CDN portion of Discharge.

Just a suggestion, thinking along the lines of providing an easy way to configure this: Perhaps a setting like restrictBucketAccess, which sets this up for users across both the S3 & CDN features of Discharge, would be more in line with the spirit of the project than one that overrides just the ACL param during S3 upload?
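For example, something along these lines in the site's JSON configuration (entirely hypothetical; the restrictBucketAccess key doesn't exist today, the other keys are only illustrative, and most existing settings are omitted):

```json
{
  "domain": "example.org",
  "cdn": true,
  "restrictBucketAccess": true
}
```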

(I've tested the HTTP header check method I suggested earlier, but unfortunately it didn't work: public-read at the object level still grants access, so the bucket policy can't override it.)

@clarencesong Sorry, I missed the notification for your comment!

Ah, yeah, I think what you want to do is probably outside the scope of this project. As I understand it, even if it's possible to do this in a way that works with the various permutations of how someone might use discharge, I think it would add too much programmatic and conceptual complexity for a feature that doesn't seem like it would be used very much.

That said, this is admittedly difficult to talk about and imagine in the abstract, so if you think I'm wrong or misunderstanding, please explain it to me! Or if you think a PR would be clearer, that works too! But I want to be upfront that I'm pretty judicious about adding complexity to projects. Thanks!

Sure, thanks for considering! I'll close this issue.