ChildishGirl / glue-data-pipeline

Serverless Glue data pipeline with Slack notifications (Python CDK)

Node 12.x no longer supported for Lambda function

clemenkok opened this issue

Hi there,

    "Runtime": "nodejs18.x",
    "Description": {
      "Fn::Join": [
        "",
        [
          "Lambda function for auto-deleting objects in ",
          {
            "Ref": "rawdata96BD17DC"
          },
          " S3 bucket."
        ]
      ]
    }

In the synthesized GluePipelineStack.template.json, the Lambda runtime has to change from nodejs12.x to nodejs18.x for the deployment to succeed (the snippet above shows the updated value). I fixed this by updating the CDK version in requirements.txt to 2.38.0.
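For anyone who wants to check a synthesized template before deploying, here is a minimal sketch that scans it for Lambda functions on a retired runtime. The function names, the sample logical IDs, and the set of retired runtimes are assumptions made for illustration; they are not taken from the repository.

```python
import json

# Runtimes treated as retired in this sketch. This set is an assumption;
# check it against the current AWS Lambda runtime documentation.
RETIRED_RUNTIMES = {"nodejs10.x", "nodejs12.x"}


def find_retired_runtimes(template, retired=RETIRED_RUNTIMES):
    """Return sorted (logical_id, runtime) pairs for Lambda functions on a retired runtime."""
    hits = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::Lambda::Function":
            continue
        runtime = resource.get("Properties", {}).get("Runtime")
        # Runtime may be an intrinsic function (a dict), so only compare plain strings.
        if isinstance(runtime, str) and runtime in retired:
            hits.append((logical_id, runtime))
    return sorted(hits)


def check_template_file(path):
    """Load a synthesized CloudFormation template from disk and scan it."""
    with open(path, encoding="utf-8") as handle:
        return find_retired_runtimes(json.load(handle))


# Small stand-in for a synthesized template (logical IDs are made up).
sample_template = {
    "Resources": {
        "AutoDeleteHandler": {
            "Type": "AWS::Lambda::Function",
            "Properties": {"Runtime": "nodejs12.x", "Handler": "index.handler"},
        },
        "NotifyFunction": {
            "Type": "AWS::Lambda::Function",
            "Properties": {"Runtime": "python3.9", "Handler": "app.handler"},
        },
        "RawBucket": {"Type": "AWS::S3::Bucket", "Properties": {}},
    }
}

print(find_retired_runtimes(sample_template))
```

Assuming the default CDK output directory, the real template could be checked with check_template_file("cdk.out/GluePipelineStack.template.json") after running cdk synth.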

Also, I hit a weird bug when trying out your code but managed to fix it (see below, in glue_pipeline_stack.py):

glue_crawler = _glue.CfnCrawler(
    self, 'glue_crawler',
    name='glue_crawler',
    role=glue_role.role_arn,
    database_name='price-database',
    targets=_glue.CfnCrawler.TargetsProperty(
        s3_targets=[_glue.CfnCrawler.S3TargetProperty(
            path=f's3://{raw_bucket.bucket_name}/',
            event_queue_arn=glue_queue.queue_arn)]),
    recrawl_policy=_glue.CfnCrawler.RecrawlPolicyProperty(
        recrawl_behavior='CRAWL_EVENT_MODE'))

I added glue_crawler = in front of _glue.CfnCrawler. Without it, the Glue trigger failed to deploy for me, apparently because of the order in which CDK deploys the resources: the Glue crawler had not been created yet when the trigger was deployed.

Otherwise, thanks for the great post! :-)

commented

Hi, thanks for highlighting this. I will update the CDK version.
Usually, CDK infers the right deployment order from the resource configuration. I tried to reproduce the problem, but the stack deployed successfully several times. Anyway, I'm glad that adding a variable fixed it for you. I would say that, in general, adding an explicit dependency between the trigger and the crawler is the more reliable fix in this case.
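To make the ordering point concrete: CloudFormation creates one resource after another only when the template links them, either through an explicit DependsOn or through a reference such as Ref or Fn::GetAtt. If a trigger names the crawler with a plain string, no link exists and the two may be created in either order. In CDK Python, an explicit dependency can be declared with something like glue_trigger.node.add_dependency(glue_crawler), where glue_trigger is a hypothetical variable name. The sketch below does not use CDK at all; it is an illustrative check on a synthesized template, with made-up logical IDs, and it ignores references hidden inside Fn::Sub strings.

```python
def referenced_ids(node):
    """Collect logical IDs referenced via Ref or Fn::GetAtt anywhere inside node."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Ref" and isinstance(value, str):
                found.add(value)
            elif key == "Fn::GetAtt":
                # Fn::GetAtt is either ["LogicalId", "Attr"] or "LogicalId.Attr".
                if isinstance(value, list) and value:
                    found.add(str(value[0]))
                elif isinstance(value, str):
                    found.add(value.split(".")[0])
            else:
                found |= referenced_ids(value)
    elif isinstance(node, list):
        for item in node:
            found |= referenced_ids(item)
    return found


def depends_on(template, resource_id, target_id):
    """True if resource_id depends on target_id, explicitly or through a reference."""
    resource = template["Resources"][resource_id]
    explicit = resource.get("DependsOn", [])
    if isinstance(explicit, str):
        explicit = [explicit]
    return target_id in explicit or target_id in referenced_ids(resource.get("Properties", {}))


def make_template(crawler_name_value, depends=None):
    """Build a tiny two-resource template (hypothetical logical IDs) for the demo."""
    trigger = {
        "Type": "AWS::Glue::Trigger",
        "Properties": {
            "Type": "CONDITIONAL",
            "Predicate": {"Conditions": [
                {"CrawlerName": crawler_name_value, "CrawlState": "SUCCEEDED"}]},
        },
    }
    if depends is not None:
        trigger["DependsOn"] = depends
    crawler = {"Type": "AWS::Glue::Crawler", "Properties": {"Name": "glue_crawler"}}
    return {"Resources": {"Crawler": crawler, "Trigger": trigger}}


by_name = make_template("glue_crawler")                   # literal string, no link
by_ref = make_template({"Ref": "Crawler"})                # implicit dependency
explicit = make_template("glue_crawler", ["Crawler"])     # explicit DependsOn

for label, tpl in [("by_name", by_name), ("by_ref", by_ref), ("explicit", explicit)]:
    print(label, depends_on(tpl, "Trigger", "Crawler"))
```

Only the first variant leaves the order undefined, which would be consistent with a deployment that succeeds on some runs and fails on others.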