awslabs / aws-emr-launch

[QUERY] - EMR stuck at bootstrapping

awsjputro opened this issue · comments

commented

Describe the bug
This may not be a bug because the root cause is not yet identified. This is more of a question in case this is a known problem/symptom.

I used the exact same package and version with the same code successfully a few months ago (Jan 2023), but I have been encountering this problem for the last few weeks. The EMR launch workflow fails: the bootstrapping phase runs for about an hour before it eventually fails with an internal error. I have confirmed that the bootstrapping script itself is not the cause; the failure is most likely in the step right after bootstrapping. I tried different EMR versions (initially 6.6.0, then 6.8.0-6.9.0) and changing the region, but got the same result.
Has anyone else encountered similar behaviour?
Any suggestions on where or what to investigate?

To Reproduce
Steps to reproduce the behavior:
Deploy and launch the EMR cluster with the workflow. Error message: "Status": {"State": "TERMINATING", "StateChangeReason": {"Message": "An internal error occurred."

Expected behavior
Successful EMR launch.

Screenshots

The behavior you describe is common with bootstrap, step, or configuration issues, and is not necessarily caused by the aws-emr-launch CDK library. It can occur when the bootstrap script does its work successfully but inadvertently returns a non-zero exit code. Any non-zero exit code from the bootstrap script is treated as a failure, and the cluster is terminated.
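To make the exit-code point concrete, here is a minimal sketch in Python of checking a script's exit code locally before using it as a bootstrap action. The helper names `run_and_report` and `would_emr_accept` are hypothetical, and the inline command is only a stand-in for a real bootstrap script; it is not part of aws-emr-launch.

```python
import subprocess
import sys


def run_and_report(command):
    """Run a command and return its exit code (hypothetical helper)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode


def would_emr_accept(exit_code):
    """Per the comment above, only exit code 0 counts as success."""
    return exit_code == 0


if __name__ == "__main__":
    # Stand-in for a bootstrap script: the work "succeeds" (it prints),
    # but the process still ends with a non-zero exit code.
    stand_in = [sys.executable, "-c", "print('installed'); raise SystemExit(3)"]
    code = run_and_report(stand_in)
    print("exit code:", code, "accepted:", would_emr_accept(code))
```

Running the real bootstrap script the same way on a test machine should show whether its last command leaks a non-zero status.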

commented

I have made sure the bootstrap script exits with 0 (and confirmed this in the output log), but the problem persists. I believe the step right after bootstrapping is the one having the problem. But I agree that this doesn't point to a problem with the aws-emr-launch package, so I will close this. Thank you for your advice.
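For anyone investigating a similar failure, one possible starting point is to pull the cluster's state and state change reason programmatically. The following is a sketch only: it assumes the response shape shown in the error message above (as returned by the boto3 EMR `describe_cluster` call), and the function names and sample data are hypothetical.

```python
def summarize_cluster_status(response):
    """Extract state and state change reason from a describe_cluster response."""
    status = response.get("Cluster", {}).get("Status", {})
    reason = status.get("StateChangeReason", {})
    return {
        "state": status.get("State"),
        "code": reason.get("Code"),
        "message": reason.get("Message"),
    }


def fetch_cluster_status(cluster_id, region_name):
    """Query EMR for a cluster (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the parser above works without it

    client = boto3.client("emr", region_name=region_name)
    return summarize_cluster_status(client.describe_cluster(ClusterId=cluster_id))


if __name__ == "__main__":
    # Sample mirroring the fields quoted in this issue; not real output.
    sample = {
        "Cluster": {
            "Status": {
                "State": "TERMINATING",
                "StateChangeReason": {"Message": "An internal error occurred."},
            }
        }
    }
    print(summarize_cluster_status(sample))
```

The message alone may not identify the failing step, so this is best combined with the bootstrap and provisioning logs written to the cluster's log location, if one was configured.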