Do not try to delete lambda@edge functions with replicas

Question

rarguelloF opened this issue 7 years ago · comments

Terraform fails to delete Lambda@Edge functions that have already been replicated, because AWS does not allow it.

From http://docs.aws.amazon.com/lambda/latest/dg/lambda-edge.html:

When you create a trigger, Lambda replicates the function to AWS Regions and CloudFront edge locations around the globe. Note that replicas can't be edited or deleted.

Terraform Version

Terraform v0.10.6

Affected Resource(s)

aws_lambda_function

Expected Behavior

Terraform should not fail and should give you some kind of warning that it couldn't delete the resource because AWS just doesn't let you.

Actual Behavior

Terraform fails because it tries to delete the Lambda function and AWS doesn't allow it.

Steps to Reproduce

1. Create a configuration file with a Lambda@Edge function.
2. Associate it with a CloudFront distribution and let it get replicated.
3. Remove the Lambda function from your configuration.
4. Run terraform apply.

David Pinfold · Answer 1 · Fri Sep 29 2017 21:14:13 GMT+0800 (China Standard Time)

I'm having this problem also.

Chris Deigan · Answer 2 · Tue Oct 17 2017 10:17:31 GMT+0800 (China Standard Time)

I can't reproduce a crash in Terraform v0.10.7, but I do still hit a dead-end where Terraform is unable to delete the resource:

Error applying plan:

1 error(s) occurred:

* module.xxx.aws_lambda_function.yyy (destroy): 1 error(s) occurred:

* aws_lambda_function.yyy: Error deleting Lambda Function: InvalidParameterValueException: Lambda was unable to delete arn:aws:lambda:us-east-1:123456789123:function:zzz:2 because it is a replicated function.
	status code: 400, request id: ........-....-....-....-............

It would be useful for the aws_lambda_function resource to support a retain_on_delete parameter, similar to aws_cloudfront_distribution, so that the resource can be "forgotten" from the Terraform state despite the limitation that the AWS APIs won't actually let you remove it.

Yves M. · Answer 3 · Thu Nov 16 2017 19:19:31 GMT+0800 (China Standard Time)

You can delete a CF trigger, but apparently there is actually no way to delete replicated functions.

Consequence: You can no longer delete the original function that was used as a CF trigger 😞

https://forums.aws.amazon.com/thread.jspa?threadID=260242&tstart=0

I reached out to my AWS contact during the Lambda@Edge preview, and he said they have no plans to support deleting replicated functions manually. But they're working on a system to automatically delete the replicated functions when you remove the trigger.

On another note, we can no longer delete the original function that was used as a CF trigger. Says "There was an error deleting your function: Lambda was unable to delete arn:aws:lambda:us-east-1:...:1 because it is a replicated function." I'd assume this is a bug in the system, since we need to be able to delete functions we created ourselves (aside from the replicated functions created automatically by CF.)

David Calhoun · Answer 4 · Wed Dec 13 2017 01:06:59 GMT+0800 (China Standard Time)

@ctd I believe the "crash" @rarguelloF describes is likely the error that is thrown. The error effectively prevents you from altering the Lambda function (e.g. renaming, deleting), as Terraform aborts any changes when failing to delete the original, replicated function.

For my project, I had to destroy and recreate the terraform.tfstate file in order to rename a function that was replicated by a CloudFront distribution. I agree with you that it would be nice for Terraform to at least support "forgetting" Lambda functions that are incapable of being deleted by the AWS API.

Rodrigo Argüello · Answer 5 · Wed Dec 13 2017 01:22:56 GMT+0800 (China Standard Time)

@ctd @dcalhoun replaced the word "crash" with "error" 👍

will Farrell · Answer 6 · Wed Dec 13 2017 03:04:14 GMT+0800 (China Standard Time)

Support reply regarding this on Nov 27, 2017

Hello,

Thank you for contacting Amazon Web Services.

Unfortunately this is an open issue which our team are looking to support. 

Today, Lambda Master Functions do not get deleted once the association and versions 
are deleted. This is something which the team is expecting to resolve very soon, at this 
moment unfortunately there is no option for customer or AWS team to delete these functions.

If you are running into any limit errors due to these stale functions do let me know, I 
will be happy to work with our teams to help increase the limits.

We appreciate your patience while this feature is made available.

Let me know if I could be of any further help on this case.

Best regards,

Dilip S | Sydney Support Center
Amazon Web Services

Does anyone have an idea what "very soon" means to AWS? 6-8 months?

Lee Cookson · Answer 7 · Wed Dec 13 2017 23:37:57 GMT+0800 (China Standard Time)

If a Lambda@Edge function can be used for other behaviors and/or other CloudFront distributions, it makes sense to me to put it in its own Terraform project and just refer to it, so that destroying a distribution or removing a Lambda trigger from a behavior doesn't attempt to destroy the (reusable) Lambda function. Other than using the ARN, I don't see a way to associate a Lambda function with a CF behavior. A data source to refer to a Lambda function would be very useful here, and we could pull the latest version using variables on the data source. Also, Lambda@Edge triggers have to use specific versions and can't use the $LATEST alias. Is there another resource we can use to allow a CloudFront distribution to be "stable" and not need a variable updated each time a new Lambda function is released?
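
For readers arriving later: newer versions of the AWS provider include an aws_lambda_function data source, which is roughly the "data object" asked for here. The fragment below is a sketch, not a complete configuration. The provider alias, the function name "viewer-request", and the distribution name are assumptions made for illustration. As far as I understand the provider documentation, leaving out qualifier resolves to the most recent published version, but check this against the provider version you use.

```hcl
# Lambda@Edge functions live in us-east-1, so look the function up there.
provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}

# The function is managed in a separate Terraform project; here it is only read.
data "aws_lambda_function" "edge" {
  provider      = aws.us_east_1
  function_name = "viewer-request" # hypothetical function name
}

resource "aws_cloudfront_distribution" "example" {
  # ... other configuration ...

  default_cache_behavior {
    # ... other configuration ...

    lambda_function_association {
      event_type = "viewer-request"
      # Versioned ARN; destroying this distribution never touches the function.
      lambda_arn = data.aws_lambda_function.edge.qualified_arn
    }
  }
}
```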

Luiz Freneda · Answer 8 · Thu Dec 14 2017 09:22:47 GMT+0800 (China Standard Time)

Same problem here..

Oliver Heck · Answer 9 · Sat Jan 06 2018 15:32:12 GMT+0800 (China Standard Time)

-_- ... sometimes aws is killing me ;)

Scott Lindsey · Answer 10 · Sat Jan 13 2018 01:56:49 GMT+0800 (China Standard Time)

January 12, 2018. I just did my level best to find a way, any way, to delete a function that has been replicated. I'm guessing that someone inside of AWS closed the bug report not understanding how serious this issue is.

Keith Gable · Answer 11 · Sun Mar 18 2018 03:22:46 GMT+0800 (China Standard Time)

The doc moved here btw: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/lambda-edge-delete-replicas.html

Jarrod Bellmore · Answer 12 · Thu May 31 2018 04:06:14 GMT+0800 (China Standard Time)

I am also blocked by this issue unfortunately.

We are trying to implement automated tests using the kitchen terraform plugin and aws-spec and this issue completely blocks this due to the terraform destroy erroring if you have a lambda@edge function in your stack.

My first thought for getting around this issue was to remove the lambda@edge functionality from the module only when tests are run but I have been unable to figure out how to accomplish this. It seems impossible to do this since the lambda_function_association property can only be set using the interpolation syntax (lambda_function_association = [ "${conditional logic here}" ] ) and that doesn't appear to support passing in a list of map objects.

Is there some other way that I can conditionally set that property based on a variable value to selectively enable or disable those associations?
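
For readers on Terraform 0.12 or later, which did not exist when this was asked: a dynamic block can switch the association on and off from a variable. This is a fragment only, and the variable name enable_edge_functions and the resource names are assumptions made for illustration.

```hcl
variable "enable_edge_functions" {
  type        = bool
  default     = true
  description = "Set to false (for example in test runs) to omit the Lambda@Edge association."
}

resource "aws_cloudfront_distribution" "example" {
  # ... other configuration ...

  default_cache_behavior {
    # ... other configuration ...

    # Emits one lambda_function_association block when enabled, none otherwise.
    dynamic "lambda_function_association" {
      for_each = var.enable_edge_functions ? [1] : []

      content {
        event_type = "viewer-request"
        lambda_arn = aws_lambda_function.example.qualified_arn
      }
    }
  }
}
```

With the variable set to false from the first apply, the function is never attached to a distribution, so no replicas should exist when the test stack is destroyed.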

Since CloudFront deletes the Lambda replicas with eventual consistency once the associations to the CloudFront distribution have been deleted, is it enough to delete the associations on destroy and trust that CloudFront will eventually clean them up? Or is the AWS API erroring when deleting the main Lambda function because replicas still exist?

Either way, until AWS fixes their issue, which seems like they aren't going to be doing anytime soon, is it possible to have the deletion be treated as a warning so it doesn't just completely block anyone trying to use it now? Even if we get a warning and have to clean up the function manually at a later date that seems the best we can hope for until AWS addresses this properly.

Thanks!

David Calhoun · Answer 13 · Sat Jul 21 2018 00:36:09 GMT+0800 (China Standard Time)

For the sake of posterity, based on the Deleting Lambda@Edge Functions and Replicas documentation stating replicas are "deleted within a few hours" I was able to...

1. Apply Terraform config to remove the Lambda association from my CloudFront distribution.
2. Wait for ~1 hour.
3. Apply Terraform config to rename my Lambda function (which requires deletion of the original).
4. Apply Terraform config to add the Lambda association to my CloudFront distribution.

So, if your use-case doesn't require everything occurring in a single Terraform apply, you should be able to delete/rename Lambda functions if you wait long enough for the replicas to be deleted after all associations to the Lambda are removed.

Annoying and limiting, but worked in my simple case.

Dmitry Erman · Answer 14 · Sat Aug 04 2018 13:35:07 GMT+0800 (China Standard Time)

@dcalhoun

Terraform v0.11.7 - Throws error but actually applies all the other changes for me.

TL;DR

You can also remove it from CF via the console and then wait 30 minutes. You don't have to do it via TF per se.

BUT... While that works, I don't see how I can use this approach in a production environment, given that the Lambda@Edge function applies rules to live traffic. If I remove it for an hour, I'm creating a production outage for an hour. That means that if you use Lambda@Edge and need to make a change, you're SOL and have to take an outage.

FYI - Amazon has the same problem with Cloud Formation templates as well.

Keith Gable · Answer 15 · Sat Aug 04 2018 14:27:15 GMT+0800 (China Standard Time)

Also, CloudFront takes 15ish minutes to deploy a change, so every part of this sucks. You could consider fixing this with an A and a B distribution which sit behind an edge distribution via a cache behavior, update the inactive one, then change the edge to point at the inactive one. But that will take 15 minutes to deploy, during which you can't reliably know if A or B is being used. So if you needed to quickly iterate, forget about it.

David Calhoun · Answer 16 · Sat Aug 04 2018 19:09:21 GMT+0800 (China Standard Time)

@dmitrye for your production use case, you could likely do the following to avoid service interruptions...

1. Apply Terraform config to create a duplicate Lambda with any changes (e.g. rename), remove old Lambda association, and add new Lambda association.
2. Wait ~1 hour for old Lambda replicas to be deleted.
3. Apply Terraform config to delete old Lambda.
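
As a rough sketch of what the configuration might look like after the first apply in this sequence: both functions exist side by side and the trigger points at the new one. All names, file paths, the runtime, and the IAM role reference are hypothetical, and the distribution is shown only as a fragment.

```hcl
# Old function: kept in place during the first apply so its replicas can drain.
# Delete this block in a later apply, about an hour after the association is gone.
resource "aws_lambda_function" "edge_old" {
  function_name = "viewer-request-old"  # hypothetical name
  role          = aws_iam_role.edge.arn # hypothetical role defined elsewhere
  filename      = "edge_old.zip"
  handler       = "index.handler"
  runtime       = "nodejs18.x"
  publish       = true
}

# New function carrying the change (for example, the new name).
resource "aws_lambda_function" "edge_new" {
  function_name = "viewer-request-new"
  role          = aws_iam_role.edge.arn
  filename      = "edge_new.zip"
  handler       = "index.handler"
  runtime       = "nodejs18.x"
  publish       = true
}

resource "aws_cloudfront_distribution" "example" {
  # ... other configuration ...

  default_cache_behavior {
    # ... other configuration ...

    # The trigger now references the new function only.
    lambda_function_association {
      event_type = "viewer-request"
      lambda_arn = aws_lambda_function.edge_new.qualified_arn
    }
  }
}
```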

Dmitry Erman · Answer 17 · Thu Aug 09 2018 14:59:06 GMT+0800 (China Standard Time)

@dcalhoun Yes, that would avoid an outage. But then I have to remember to clean up the old Lambda@Edge function. Unused Lambdas don't cost me anything, so theoretically you could have a weekly script/job via TF that cleans up old functions from your config.

Would be nice if TF incremented the definition name with an ID (or some other tokenizer) so that each compile generates a new function instead of updating existing one.

AdeOpe · Answer 18 · Sat Feb 09 2019 00:53:30 GMT+0800 (China Standard Time)

i too also hit the same problem today :(

James Limmer · Answer 19 · Fri Apr 26 2019 08:35:47 GMT+0800 (China Standard Time)

Yep. This is an issue. The terraform destroy marks as failed / error because of this one lambda @ edge.

Amit Sehgal · Answer 20 · Wed Jun 05 2019 23:04:48 GMT+0800 (China Standard Time)

Is there a way to ignore only this Error and make the stack destroy pass ?
Error deleting Lambda Function: InvalidParameterValueException: Lambda was unable to delete arnxxx because it is a replicated function. Please see our documentation for Deleting Lambda@Edge Functions and Replicas.

LevinDmytro2 · Answer 21 · Thu Dec 26 2019 18:20:23 GMT+0800 (China Standard Time)

Is it possible to wait for the replicas to be removed, for example through a timeout, so that Terraform itself checks whether the replicas have been deleted and then finishes without an error?
Or can I do this with some scripting in the .tf file?

Jeff D · Answer 22 · Fri Apr 17 2020 02:38:54 GMT+0800 (China Standard Time)

I also ran into this issue when trying to rename a lambda@edge function (essentially replacing an existing function with a new one). The Cloudfront update would fail as terraform is unable to delete the lambda function immediately. My workaround was to delete the function from state first:

terraform state rm aws_lambda_function.my_lambda

And then run terraform apply. Terraform was then able to create the new Lambda function and update CloudFront. I still had to delete the old Lambda function manually, though.
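
A declarative alternative to terraform state rm, assuming Terraform 1.7 or later (which postdates this comment): replace the function's resource block with a removed block. Terraform then drops the resource from state on the next apply without calling the delete API. The address below is the one from the command above; the resource block for my_lambda has to be deleted from the configuration at the same time.

```hcl
# Forget the old function instead of destroying it.
removed {
  from = aws_lambda_function.my_lambda

  lifecycle {
    # false means: remove from state only, leave the real function in AWS.
    destroy = false
  }
}
```

The old function still has to be deleted by hand (or by a later cleanup job) once its replicas are gone.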

Justin Bailey · Answer 23 · Wed Nov 18 2020 05:39:16 GMT+0800 (China Standard Time)

3 years later and this is still a problem ... really, the fix needs to come from Amazon. If they didn't treat trying to delete a replicated function as an error, this all goes away. I put up a forum thread, please check it out if you agree https://forums.aws.amazon.com/thread.jspa?threadID=331402

surendarkaniops · Answer 24 · Fri Sep 17 2021 17:05:18 GMT+0800 (China Standard Time)

I have a workaround for this issue; it helped me get past this known problem:

resource "time_sleep" "wait_30_seconds" {
  # If you use a resource block instead of a module, reference that resource here.
  depends_on = [module.lambda_at_edge]

  # Despite the resource name, the delay on destroy is 1200 seconds (20 minutes).
  destroy_duration = "1200s"
}

resource "null_resource" "next" {
  depends_on = [time_sleep.wait_30_seconds]
}

andreosipov · Answer 25 · Mon Sep 27 2021 20:33:53 GMT+0800 (China Standard Time)

thanks a lot, looks like it works
however i have a question: could you please explain how this works? from what i see you need to remove association between lambda and cloudfront, but this workaround simply adds a timeout before lambda edge destruction; so how does this piece of terraform code remove lambda from cloudfront behavior?

surendarkaniops · Answer 26 · Thu Sep 30 2021 21:09:48 GMT+0800 (China Standard Time)

Hi, when you run "terraform destroy", Terraform destroys the CloudFront distribution first and then the Lambda function, so we don't need any code to remove the association between CloudFront and Lambda.
Basically, Lambda@Edge functions can be deleted 15 to 20 minutes after the CloudFront distribution is deleted.

The sample code delays the deletion of the Lambda function by 1200 seconds.
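
A variation on this workaround, not the poster's exact code: if the distribution itself depends on the time_sleep resource, the destroy order becomes distribution, then the 1200-second sleep, then the function, so the wait only starts once the distribution is gone. The resource names are assumptions, the distribution is shown as a fragment, and 1200s is simply the value from this thread; AWS does not guarantee replicas are removed within that time.

```hcl
resource "time_sleep" "wait_for_replica_cleanup" {
  # Created after the function, so it is destroyed before the function.
  depends_on = [aws_lambda_function.edge]

  # Sleep only on destroy, giving CloudFront time to remove the replicas.
  destroy_duration = "1200s"
}

resource "aws_cloudfront_distribution" "example" {
  # ... other configuration, including the lambda_function_association ...

  # Created after the sleep resource, so it is destroyed before the sleep starts.
  depends_on = [time_sleep.wait_for_replica_cleanup]
}
```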

Josh Nisenson · Answer 27 · Fri May 06 2022 04:04:44 GMT+0800 (China Standard Time)

@dcalhoun Can you tell me how to remove the association? What does that look like?

David Calhoun · Answer 28 · Fri May 06 2022 05:27:36 GMT+0800 (China Standard Time)

@josh803316 it has been almost 4 years since I posted this, so I struggle to recall the specific steps. 😅 I imagine it involves removing the lambda_function_association configuration argument from your Terraform config file and applying that change with Terraform CLI apply (or whatever method you use to apply Terraform configuration).

diff --git a/example.yml b/example.yml
index 4b843c1..929c01f 100644
--- a/example.yml
+++ b/example.yml
@@ -4,11 +4,5 @@ resource "aws_cloudfront_distribution" "example" {
   # lambda_function_association is also supported by default_cache_behavior
   ordered_cache_behavior {
     # ... other configuration ...
-
-    lambda_function_association {
-      event_type   = "viewer-request"
-      lambda_arn   = aws_lambda_function.example.qualified_arn
-      include_body = false
-    }
   }
 }

Hope this helps!

Josh Nisenson · Answer 29 · Fri May 06 2022 05:42:05 GMT+0800 (China Standard Time)

@dcalhoun Yes it does and it's the same thing I arrived at as well. The good news is that it does allow me to destroy all the resources except the lambda function, the bad news is the destroy still ends up failing because of the issue above (replication and lambdas not being destroyed immediately). Thanks so much for the suggestion!

Dirk Avery · Answer 30 · Sat Aug 06 2022 02:58:49 GMT+0800 (China Standard Time)

Based on the age of this issue, I would say that at this point it needs to be thoroughly researched again to ensure it is still an issue. There may be an upstream component to this but in the provider we may be able to do something to improve the behavior (e.g., eating the error or trying eventual consistency approaches to wait for CF).

Eddie Herbert · Answer 31 · Mon Aug 29 2022 19:34:45 GMT+0800 (China Standard Time)

It's still indeed an issue. The workaround above — #1721 (comment) — works, however we should look into adding native support for this in the provider itself.

Jason Bosco · Answer 32 · Tue Jan 24 2023 08:54:08 GMT+0800 (China Standard Time)

To add another data point - this is still an issue in Jan 2023.

ThePinzon · Answer 33 · Sat Feb 18 2023 02:17:36 GMT+0800 (China Standard Time)

I confirm this is still an issue for us.

Josh Nisenson · Answer 34 · Sat Feb 18 2023 02:31:04 GMT+0800 (China Standard Time)

It is still an issue for us as well.

Kit Ewbank · Answer 35 · Fri Feb 24 2023 06:11:07 GMT+0800 (China Standard Time)

The proposed solution for this issue is to add a skip_destroy argument to the aws_lambda_function resource as has been done for

aws_cloudwatch_log_group: #26775
aws_imagebuilder_component: #28905
aws_ecs_task_definition: #22269
aws_lambda_layer_version: #11997

and others. If the attribute is set to true then the Lambda function is removed from Terraform state without attempting a DeleteFunction API call.
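
A sketch of how the proposed argument might be used, assuming a provider version that includes it (the thread reports the release as v4.57.0). The function name, role reference, file name, and runtime are hypothetical.

```hcl
resource "aws_lambda_function" "edge" {
  function_name = "viewer-request"      # hypothetical name
  role          = aws_iam_role.edge.arn # hypothetical role defined elsewhere
  filename      = "edge.zip"
  handler       = "index.handler"
  runtime       = "nodejs18.x"
  publish       = true

  # On destroy, drop the function from Terraform state without calling DeleteFunction.
  skip_destroy = true
}
```

The function stays in the AWS account after terraform destroy and has to be cleaned up separately once its replicas are gone.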

Kit Ewbank · Answer 36 · Fri Feb 24 2023 06:13:33 GMT+0800 (China Standard Time)

Build on #29615.

Eddie Herbert · Answer 37 · Fri Feb 24 2023 06:44:41 GMT+0800 (China Standard Time)

That doesn’t resolve the issue though. It merely makes it so the user now has to manually delete the Lambda function. The workaround with the sleep and null resource is an appropriate method to maintain automated resources.

Richard Jennings · Answer 38 · Sat Feb 25 2023 00:29:07 GMT+0800 (China Standard Time)

@ewbankkit I would like to pick up your solution as a PR if no one else is working on it

Kit Ewbank · Answer 39 · Sat Feb 25 2023 00:55:32 GMT+0800 (China Standard Time)

@richardjennings I'll be doing the PR, likely submitted later today. Thanks though 👏.

Kit Ewbank · Answer 40 · Sat Feb 25 2023 00:56:50 GMT+0800 (China Standard Time)

@edwardofclt I also plan on retrying delete (up to a configurable timeout) if the error code indicates a replicated Lambda@Edge function.
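
A sketch of what tuning that retry window might look like, assuming the configurable timeout is exposed through the resource's standard timeouts block; please confirm against the aws_lambda_function documentation for your provider version. The 60m value and all names are illustrative only.

```hcl
resource "aws_lambda_function" "edge" {
  function_name = "viewer-request"      # hypothetical name
  role          = aws_iam_role.edge.arn # hypothetical role defined elsewhere
  filename      = "edge.zip"
  handler       = "index.handler"
  runtime       = "nodejs18.x"
  publish       = true

  timeouts {
    # Assumed behavior: keep retrying the delete for up to an hour while replicas drain.
    delete = "60m"
  }
}
```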

github-actions · Answer 41 · Fri Mar 03 2023 22:06:24 GMT+0800 (China Standard Time)

This functionality has been released in v4.57.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

github-actions · Answer 42 · Tue Apr 04 2023 10:11:28 GMT+0800 (China Standard Time)

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.