aws / aws-cdk-rfcs

RFCs for the AWS CDK

Service Catalog ProductStack Asset Support

wanjacki opened this issue · comments

Description

The feature is an improvement to the existing ProductStack construct to add support for the use of Asset files.

Use Case

I'm always frustrated as a Service Catalog administrator when I try to add a Lambda function to my ProductStack in CDK: I want to reference my Lambda code from an asset file, but CDK throws an error when I attempt to synthesize. This limitation means that I can't use ProductStack to create a Service Catalog product consisting of Lambdas that run large amounts of code. This is an example of a product which I would like to deploy to Service Catalog and share with end users across AWS accounts:

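// Imports assume CDK v1 package names, since the snippet uses cdk.Construct
import * as path from 'path';
import * as cdk from '@aws-cdk/core';
import * as apigw from '@aws-cdk/aws-apigateway';
import * as lambda from '@aws-cdk/aws-lambda';
import * as sc from '@aws-cdk/aws-servicecatalog';

class ServerlessProduct extends sc.ProductStack {
  constructor(scope: cdk.Construct, id: string) {
    super(scope, id);

    // Defines an AWS Lambda resource whose code is bundled from a local directory (a file asset)
    const myHandler = new lambda.Function(this, 'Handler', {
      runtime: lambda.Runtime.NODEJS_14_X,
      code: lambda.Code.fromAsset(path.join(__dirname, 'handler')),
      handler: 'index.handler'
    });

    // Defines an API Gateway REST API resource backed by the handler function
    new apigw.LambdaRestApi(this, 'Endpoint', {
      handler: myHandler
    });
  }
}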

When cdk synth is called, it will throw: Service Catalog Product Stacks cannot use Assets

Proposed Solution

Currently, CDK vends an asset bucket during bootstrap-time to the customer's AWS account. This bucket can be used successfully for enabling file asset support in ProductStack with CFN outputs from the parent stack for both the S3 bucket name and object key. The major issue with this approach is that when sharing a Service Catalog portfolio across accounts, a product that makes use of file assets cannot be provisioned since the parent stack with the aforementioned outputs does not exist in the end-user account.

To solve this, we could implement the usage of a bespoke S3 Bucket to contain asset files from assets used in a Service Catalog ProductStack. A bespoke bucket for this use case allows us to control the naming of the bucket as well as its permissions. Controlling the bucket name is important, especially at synth-time, since this will be referenced by resources that use assets, such as a Lambda function which references Python code stored in an asset file in the S3 bucket. Controlling permissions on a bucket which contains assets is important within the framework of Service Catalog since the administrator of a Service Catalog portfolio shares this portfolio across AWS accounts with end users who make use of products which reference asset files.

In order to implement this solution, we make use of S3 Deployments. S3 Deployments will essentially copy the assets from the bootstrap asset bucket to our own S3 Bucket enabling us to control the naming and permissions on the bucket.

Users can start using assets in their ProductStack without any extra configuration. In that case, a ProductStackAssetBucket (which extends the S3 Bucket construct) is generated automatically for each ProductStack and named based on the accountId and region. The assets are then copied over to this bucket using S3 Deployment. The limitation here is that each ProductStack generates its own bucket.

We understand that creating a bucket for each ProductStack may not be the ideal user experience, but due to limitations we are unable to create them at the Portfolio level. As a workaround, we also allow users to define their own ProductStackAssetBucket and pass it as a property to their ProductStack. This allows users to define a single bucket and use it for multiple ProductStacks. The only requirement is that the user must define a bucketName for their bucket, because the bucketName needs to be written to the template generated by the ProductStack.

Finally, when the user adds their ProductStack to a Product, the Product will become aware of the asset bucket(s). When they associate their Product to a Portfolio, the Portfolio will also become aware of the asset bucket(s). When they share their Portfolio with an end user account, the asset bucket(s) will grant read permissions to the end user account, so they will be able to access these assets when they provision their Product.
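
The propagation described above can be sketched with a small in-memory model. This is only an illustration of the flow (ProductStack to Product to Portfolio to shared account). The class and field names below are hypothetical and are not the real CDK constructs or their API.

```typescript
// Hypothetical in-memory model; none of these types are real CDK constructs.
interface AssetBucketModel {
  bucketName: string;
  readers: string[]; // account IDs that have been granted read access
}

class ProductStackModel {
  constructor(public readonly id: string, public readonly assetBucket?: AssetBucketModel) {}
}

class ProductModel {
  readonly assetBuckets: AssetBucketModel[] = [];

  // Adding a stack makes the product aware of that stack's asset bucket, if any.
  addProductStack(stack: ProductStackModel): void {
    if (stack.assetBucket && this.assetBuckets.indexOf(stack.assetBucket) === -1) {
      this.assetBuckets.push(stack.assetBucket);
    }
  }
}

class PortfolioModel {
  private readonly assetBuckets: AssetBucketModel[] = [];

  // Associating a product makes the portfolio aware of the product's buckets.
  addProduct(product: ProductModel): void {
    product.assetBuckets.forEach((bucket) => {
      if (this.assetBuckets.indexOf(bucket) === -1) {
        this.assetBuckets.push(bucket);
      }
    });
  }

  // Sharing the portfolio grants the target account read access on every known bucket.
  shareWithAccount(accountId: string): void {
    this.assetBuckets.forEach((bucket) => {
      if (bucket.readers.indexOf(accountId) === -1) {
        bucket.readers.push(accountId);
      }
    });
  }
}

// Usage: one stack with a bucket, added to a product, associated, then shared.
const exampleAssetBucket: AssetBucketModel = { bucketName: 'product-assets-example', readers: [] };
const serverlessProduct = new ProductModel();
serverlessProduct.addProductStack(new ProductStackModel('ServerlessProduct', exampleAssetBucket));

const sharedPortfolio = new PortfolioModel();
sharedPortfolio.addProduct(serverlessProduct);
sharedPortfolio.shareWithAccount('111122223333');
// exampleAssetBucket.readers now contains '111122223333'
```

In this model the order of calls matters: a stack added to the product after the product was associated with the portfolio would not be picked up. Whether the real implementation has the same constraint is not stated in this issue.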

Draft PR:

A draft PR of this implementation has already been opened here. I was advised to go through the RFC process.
aws/aws-cdk#22143

Updated PR:

aws/aws-cdk#22857

Roles

Role User
Proposed by @wanjacki
Author(s) @wanjacki, @mackalex
API Bar Raiser @corymhall
Stakeholders @alias, @alias, @alias

See RFC Process for details

Workflow

  • Tracking issue created (label: status/proposed)
  • API bar raiser assigned (ping us at #aws-cdk-rfcs if needed)
  • Kick off meeting
  • RFC pull request submitted (label: status/review)
  • Community reach out (via Slack and/or Twitter)
  • API signed-off (label api-approved applied to pull request)
  • Final comments period (label: status/final-comments-period)
  • Approved and merged (label: status/approved)
  • Execution plan submitted (label: status/planning)
  • Plan approved and merged (label: status/implementing)
  • Implementation complete (label: status/done)

Author is responsible to progress the RFC according to this checklist, and
apply the relevant labels to this issue so that the RFC table in README gets
updated.

As one of the voices of the community, I would like to say that many teams working with CDK + ServiceCatalog have been waiting for this feature for a long time. My team discovered ProductStack over a year ago, and it was a game changer for us, allowing us to move away from all sorts of copying CFN outputs + assets around. The lack of support for assets is technical debt we are looking forward to resolving.
On design considerations, we realize that these are topics for deeper discussion, but it's also worth noting that ProductStack without asset provisioning capabilities misses the point in most cases, since every major application has at least some complex/large bundled lambdas.
We would love to see this problem solved in the near future.

Hi @wanjacki! Thanks for submitting this! The goal of the RFC process is to describe the feature, with code samples, in user-facing language. (Effectively, what would go into the README)

Follow that up with design decisions that will come from looking at the code. For example:

  • Why can't a Portfolio automatically have a single bucket, again? (We'd rather have it be)
  • Why must ProductStackAssetBucket be its own class? (We'd rather just have a normal bucket I'd think)

Finally: what are negative side effects? What are the things we are locked into for the future once we decide to do this?

@wanjacki I've added myself as the API bar raiser. You can reach out on Slack and we can set up the kick-off meeting.

@rix0rrr

  • Why can't a Portfolio automatically have a single bucket, again? (We'd rather have it be)

  • The BucketName needs to be written to the CFN template generated by the Product Stack when the Product Stack is synthesized.
    If a Portfolio automatically had a single bucket, the Product Stack would not know at synthesis time which Portfolio it will end up in.
    Even if this were possible, what would happen if the Product Stack were added to two different Portfolios? We can't write two buckets to the template.

The other argument we get is: what if multiple Product Stacks share a bucket?
In that case we don't know whether these Product Stacks will end up in the same Portfolio, so when the entire Portfolio is shared we might unintentionally share assets from a Product Stack that was never added to it.

Our conclusion is that an asset belongs to the Product Stack, not to the Portfolio, so the bucket that contains the asset also belongs to the Product Stack.
We will only generate one bucket per Product Stack to ensure no assets are unintentionally shared.

As a workaround, we allow users to pass their own bucket, so that they can use a single bucket for all the Product Stacks they know will go into a specific Portfolio.
If they want to use one bucket for everything, that is fine too. The user is in control of their own buckets and should understand the risk that assets may be shared unintentionally.
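
The risk described above can be illustrated with a small model. It assumes, as in the proposal, that sharing a Portfolio grants read access per bucket rather than per object. The types and the `exposedAssets` helper are hypothetical and exist only for this illustration.

```typescript
// Hypothetical model of which assets become readable when a portfolio is shared.
interface StackAssets {
  stackId: string;
  bucketName: string;
  assetKeys: string[];
}

// Access is assumed to be granted per bucket, so every object in a bucket used by
// a stack in the portfolio becomes readable, whichever stack uploaded it.
function exposedAssets(portfolioStackIds: string[], allStacks: StackAssets[]): string[] {
  const sharedBuckets = allStacks
    .filter((s) => portfolioStackIds.indexOf(s.stackId) !== -1)
    .map((s) => s.bucketName);

  const exposed: string[] = [];
  allStacks.forEach((s) => {
    if (sharedBuckets.indexOf(s.bucketName) !== -1) {
      s.assetKeys.forEach((key) => exposed.push(s.bucketName + '/' + key));
    }
  });
  return exposed;
}

// Two stacks sharing one bucket versus one bucket per stack.
const sharedBucketLayout: StackAssets[] = [
  { stackId: 'StackA', bucketName: 'shared-assets', assetKeys: ['a.zip'] },
  { stackId: 'StackB', bucketName: 'shared-assets', assetKeys: ['b.zip'] },
];
const perStackLayout: StackAssets[] = [
  { stackId: 'StackA', bucketName: 'stack-a-assets', assetKeys: ['a.zip'] },
  { stackId: 'StackB', bucketName: 'stack-b-assets', assetKeys: ['b.zip'] },
];

// Only StackA is in the portfolio that gets shared.
const exposedWithSharedBucket = exposedAssets(['StackA'], sharedBucketLayout);
// -> ['shared-assets/a.zip', 'shared-assets/b.zip'] (StackB's asset leaks)
const exposedWithPerStackBuckets = exposedAssets(['StackA'], perStackLayout);
// -> ['stack-a-assets/a.zip']
```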

  • Why must ProductStackAssetBucket be its own class? (We'd rather just have a normal bucket I'd think)

  • ProductStackAssetBucket used to contain more logic and restrictions, but a lot of that was removed after the suggestion to extend Bucket.
    It currently contains only two things: a check that the BucketName is specified (it must be specified so it can be written to the template), and code to accumulate assets and deploy them all at once in a single BucketDeployment when a Product Stack contains multiple assets. This isolates logic from the current Product Stack code, but I am not against moving everything into the Product Stack itself and using a regular bucket if that is what is recommended.
    It might be tricky to check that the BucketName is specified, though. When it is not specified, I think a token is generated and ends up being passed as an import to the template, which would not work when shared. I think we lose direct access to the props when a regular bucket is passed, so we can't check the name, but I'm not sure. I would have to dive a bit more into it, but it should be manageable.

Finally: what are negative side effects? What are the things we are locked into for the future once we decide to do this?

I guess the biggest thing is that we are locked into supporting assets in Product Stack. There is no really clean way to support this feature: we have resorted to S3Deployment to accomplish it, which may itself seem a bit hacky, but every other solution is either not possible or would not provide a good user experience. If a better solution appears in the future, we might still have to support this feature to ensure backwards compatibility. I don't know if this will set a precedent for other teams with similar use cases, but this is a highly requested feature for our customers and our team will be taking on the responsibility of maintaining it. I don't see any other major risk, but feel free to point out any that you spot.

Takeaways from our meeting.

  • Add an optional prop for the asset bucket
    • If the user wants to use assets they must provide the bucket
    • The bucket can either be an owned bucket or a referenced bucket (i.e. Bucket.fromBucketName)
      • If it is a referenced bucket then don't add a bucket policy and warn the user that they must make sure the bucket has the correct permissions.
    • Check if the bucket has a bucketName set and fail if it does not.
  • Document the process for setting up cross-account access to the bucket. The admin creates the product/portfolio and the asset bucket, and then the user that provisions the product needs to have access to that bucket. How is that access set up? Is it a manual process or can it be automated at all?
    • What if the user uses the launch role?
    • What if the user uses their own role?
    • Can we just give blanket access? Provisioning account can read from all buckets in the product account?
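
The bucket-related takeaways can be read as a small decision procedure. The sketch below models only that logic. The types, the `planAssetBucket` function, and the error and warning texts are hypothetical and do not come from the CDK.

```typescript
// Hypothetical types that mirror the decisions in the meeting notes.
interface AssetBucketInput {
  bucketName?: string; // undefined models a bucket without an explicit name
  owned: boolean;      // false models a referenced bucket (e.g. Bucket.fromBucketName)
}

interface AssetBucketPlan {
  bucketName: string;
  addBucketPolicy: boolean;
  warnings: string[];
}

function planAssetBucket(usesAssets: boolean, bucket?: AssetBucketInput): AssetBucketPlan | undefined {
  // The prop is optional: a product stack without assets needs no bucket.
  if (!usesAssets) {
    return undefined;
  }
  // If the user wants to use assets, they must provide the bucket.
  if (!bucket) {
    throw new Error('An asset bucket must be provided to use assets in a ProductStack');
  }
  // The name is written into the product template, so it must be set.
  if (!bucket.bucketName) {
    throw new Error('The asset bucket must have a bucketName set');
  }
  if (bucket.owned) {
    return { bucketName: bucket.bucketName, addBucketPolicy: true, warnings: [] };
  }
  // Referenced bucket: do not add a bucket policy, only warn.
  return {
    bucketName: bucket.bucketName,
    addBucketPolicy: false,
    warnings: ['Referenced bucket: make sure it grants read access to the provisioning accounts'],
  };
}

const ownedBucketPlan = planAssetBucket(true, { bucketName: 'my-product-assets', owned: true });
const referencedBucketPlan = planAssetBucket(true, { bucketName: 'existing-assets', owned: false });
```

The open questions about launch roles and blanket access are deliberately left out of this sketch, since the notes do not settle them.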

@corymhall @rix0rrr
Here is the new PR for the RFC: aws/aws-cdk#22857

I opted not to give too much guidance on permissions and how users can set everything up, as it is all standard Service Catalog setup and Service Catalog already documents it. Admins already need to set up permissions for the resources they provision, so I think documentation saying they also need to add read permissions to our asset bucket is sufficient. How they want to set up the permissions (blanket or restricted) should be the responsibility of the Admin.
Everything else is implemented as discussed in RFC kickoff meeting.

Please add at least one working example to the README (permissions wise), and refer users to the existing documentation for more details.

@wanjacki just to reiterate, this is what we need at a minimum to accept this into the CDK.

  1. Either a completely automated setup, or complete documentation for setting this up manually. If the documentation doesn't yet exist, then this might be a good opportunity to create it! Someone who has never used Service Catalog should be able to look at the CDK docs and have all the info they need to make it work.

  2. We need a complete integration test where a product is created with assets and then is provisioned, otherwise how do we know that it works?

You also have the option of creating your own construct (we can create a repo for you on cdklabs) if you do not want to follow the CDK contributing process. Let us know!

@rix0rrr @corymhall
Updated the README to include a sample policy, a link to another section in the README for launch roles, and a link to the Service Catalog documentation for launch roles. I believe all the documentation is already there and even someone who has never used Service Catalog will be able to figure it out.

We need a complete integration test where a product is created with assets and then is provisioned, otherwise how do we know that it works?

I thought we already discussed this, this is not one of the points of our takeaway from our meeting as shown in the notes above.
I am going to push back on this. There is no precedent for it: why wasn't this a requirement when we originally pushed the Product L2 Constructs? There is no guarantee that the products created by CDK are provisionable.
The L2 Constructs create Products and Portfolios, and the corresponding S3 Bucket with assets and permissions; they are Constructs used by the Admin, not the End User. The experience is Admin uses CDK to create Products/Portfolios and share them. End users, on their own accounts (I suppose they can use CDK for this, but it is unlikely), will import that portfolio and provision it.
It is not our responsibility to test if it is provisionable. None of our L2 Constructs actually provision the product, and no one using Service Catalog in CDK is going to provision the product in the same deployment in which they create the product and portfolio.

Despite that, I attempted to use the underlying CfnCloudFormationProvisionedProduct resource and could not get it to work: it is not able to find the product. CFN creates this resource before the Portfolio/Product are fully done. I tried adding a dependency, but that does not work, I think because there is a delay before the product is visible. Even if we get past that, the product might not be visible to the role running the integration test unless we can specify and add this role to the portfolio. This also doesn't test whether assets work cross-account, which is the main use case. As it neither makes sense nor works to test this in one deployment (it is probably possible, but I don't see how to do it), we could try to have two deployments somehow.

I have already spent a good amount of time on this requirement, so I don't want to waste any more. If you insist on this, then let's set up a meeting and discuss what we can do and what the solutions are, as I am out of options. If it's not possible to push this through without it, then let's discuss the alternatives. A lot of customers are asking for this feature, and if this is blocking it, there isn't much more I can do on my own regarding this requirement.

You also have the option of creating your own construct (we can create a repo for you on cdklabs) if you do not want to follow the CDK contributing process. Let us know!
Can you elaborate on how this would work and how it would affect the experience of our customers using Service Catalog in CDK?

Thank you!

We need a complete integration test where a product is created with assets and then is provisioned, otherwise how do we know that it works?

I thought we already discussed this, this is not one of the points of our takeaway from our meeting as shown in the notes above.

I guess I forgot to include it in the notes, definitely remember discussing it and insisting on that as a requirement.

The experience is Admin uses CDK to create Products/Portfolios and share them... It is not our responsibility to test if it is provisionable,

It 100% is our responsibility to test that it is provisionable with assets. We may not need to test that a standard product can be provisioned since that functionality exists inside Service Catalog. Assets is something managed completely outside of service catalog. You may have setup access within Service Catalog correctly and your product could fail to provision because of an S3 access denied error message. For example, are you 100% sure (I am not) that the permissions that you added to the README are the only permissions that are needed to deploy? And is that the best we can do with permissions (read from all S3 buckets)? I know we talked about looking into ways of scoping this down more.

It might not be possible to completely automate the integration test, and that is completely fine. We have several integration tests that require following manual steps.

@corymhall
Assets is something managed completely outside of service catalog.
So are templates generated by our Product Stack (Product Stacks are a CDK-specific concept, and CFN templates exist outside of Service Catalog). Those templates might not have been generated properly and are not provisionable but we don't test this. Our current integration test creates the asset, creates a bucket, uses the underlying bucket deployment to copy the asset from the CDK bootstrap bucket, and passes that final bucket asset location to our template.

We may not need to test that a standard product can be provisioned since that functionality exists inside Service Catalog.
Just like ProductStack, everything has been set up to make sure it is provisionable, but we don't test the actual provisioning. Service Catalog does not treat a template with an asset any differently from a template without one when provisioning: it checks that you have permissions to every resource you specified in the template, including the reference to your asset bucket.

You may have setup access within Service Catalog correctly and your product could fail to provision because of an S3 access denied error message.
Once again, it is up to the admin to make sure they have permissions for every resource they specify in their ProductStack/template. They specified the asset and the corresponding bucket, so they should add permissions to avoid an S3 access denied error.

For example, are you 100% sure (I am not) that the permissions that you added to the README are the only permissions that are needed to deploy?
No, they will likely need other permissions to deploy that are not specific to assets, but it's not up to us to determine that (although a lot of customers have asked us to scan their template and generate a minimal permission role for them, that's a separate feature). We can't determine permissions, as they differ for each template. There might be generic permissions, but these should already be outlined in the Service Catalog docs. I think there are even some existing managed policies that they can add to their role.

For using assets specifically, they just need to provide access to the S3Bucket where assets are stored (that they provided).

And is that the best we can do with permissions (read from all S3 buckets)? I know we talked about looking into ways of scoping this down more.
We did, but how we scope it down depends on how the Admin wants to scope it down. They can restrict it to the one bucket they provided, or to multiple buckets. They can specify all buckets not in the account if they want, but if they are using launch roles they don't need to do that: in that case SC assumes a role on the Admin account to provision, so none of their other buckets are exposed anyway. There is no such restriction in Service Catalog. If the Admin wants to give blanket access they are free to do so, and I don't think a restriction should apply in CDK either. Basically, I don't want to suggest how the Admin should specify their permissions, as that has been left up to them in Service Catalog.

It might not be possible to completely automate the integration test, and that is completely fine. We have several integration tests that require following manual steps.
Hey Cory, please link me to an example of one of those integration tests with manual steps and I can see if it's possible to implement provisioning in the same way, if you still believe it is needed.

Those templates might not have been generated properly and are not provisionable but we don't test this.

Unless I am misunderstanding how the ProductStack works, the ProductStack construct does not actually define any resources within the stack. We do in fact have an integration test for what we control, but since the user is responsible for actually defining the stack that will be deployed, that is not tested. If we actually defined a ProductStack which contained some sort of infrastructure, then we would 100% need to have an integration test showing that it could be provisioned, otherwise what good is it?

For using assets specifically, they just need to provide access to the S3Bucket where assets are stored (that they provided).

I guess to put it a different way, how do you know that the bucket policy that you are adding actually grants access? That's not something the user is deciding. If the bucket policy was not correct, we would receive a bug report which we would need to fix.

@corymhall
Unless I am misunderstanding how the ProductStack works, the ProductStack construct does not actually define any resources within the stack. We do in fact have an integration test for what we control, but since the user is responsible for actually defining the stack that will be deployed, that is not tested. If we actually defined a ProductStack which contained some sort of infrastructure, then we would 100% need to have an integration test showing that it could be provisioned, otherwise what good is it?

This is how a user would use ProductStack

class HelloServerlessProduct extends sc.ProductStack {
  constructor(scope: Construct, id: string, props: ProductStackProps) {
    super(scope, id, props)

    // Defines an AWS Lambda resource
    const hello = new lambda.Function(this, 'HelloHandler', {
      runtime: lambda.Runtime.PYTHON_3_9,
      code: lambda.Code.fromAsset("./assets"),
      handler: 'index.handler'
    });
  }
}

Users define their own resources, just like they also have to define their own assets. If they don't define any assets, then there's no additional infrastructure.

Users also define the Bucket to be used for assets, if they don't define this then there is no additional infrastructure.

const testAssetBucket = new Bucket(this, 'testBucket', {
  // Placeholder name: S3 bucket names must be lowercase and globally unique
  bucketName: 'test-asset-bucket',
});

const productStack = new HelloServerlessProduct(this, 'HelloServerlessProduct', {
  assetBucket: testAssetBucket,
});

I guess to put it a different way, how do you know that the bucket policy that you are adding actually grants access? That's not something the user is deciding. If the bucket policy was not correct, we would receive a bug report which we would need to fix.

I'm not sure I understand, but in the README I provided this policy. It is just a generic S3 policy that allows GetObject on any resource, which is all they need to the best of my knowledge. I can test it more thoroughly. Any role with this policy should be able to retrieve the asset from the bucket. Read access on the bucket is granted to an account automatically when we share the portfolio.

{
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": "*"
        }
    ]
}
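
For admins who prefer not to allow `s3:GetObject` on `"*"`, a narrower variant of the policy above can be generated per asset bucket. The helper below is a hypothetical sketch and is not part of the CDK or the README. It assumes the standard `aws` partition and uses a placeholder bucket name.

```typescript
interface PolicyStatement {
  Effect: 'Allow';
  Action: string[];
  Resource: string[];
}

interface PolicyDocument {
  Version: string;
  Statement: PolicyStatement[];
}

// buildAssetReadPolicy is a hypothetical helper name used only for this illustration.
function buildAssetReadPolicy(bucketNames: string[]): PolicyDocument {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Action: ['s3:GetObject'],
        // Object ARNs in the "aws" partition have the form arn:aws:s3:::<bucket>/<key>
        Resource: bucketNames.map((name) => 'arn:aws:s3:::' + name + '/*'),
      },
    ],
  };
}

// 'my-product-assets' is a placeholder for the admin's asset bucket name.
const scopedAssetPolicy = buildAssetReadPolicy(['my-product-assets']);
const scopedAssetPolicyJson = JSON.stringify(scopedAssetPolicy, null, 2);
```

Whether read access to the asset bucket alone is enough to provision a given product depends on the rest of the template, as discussed in this thread.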

Here is an example https://github.com/aws/aws-cdk/blob/main/packages/%40aws-cdk/aws-codedeploy/test/ecs/integ.deployment-group.ts

This seems doable. Would I just need to add instructions telling them to deploy the integration stack, then run a CLI command to provision the product, then another CLI command to delete the provisioned product, and finally delete the stack? Is that sufficient?

@corymhall
Hey Cory I updated and tested/deployed the integ test with a manual step to provision the product (with asset).
Let me know what you think and if there are any other blockers and I will address them. Thanks.

@corymhall @rix0rrr
Any updates or other feedback on this PR?
#458