GoogleCloudPlatform / guest-agent

More documentation about "Compute Engine cannot guarantee that the shutdown script will complete"

pgillich opened this issue · comments

The documentation at https://cloud.google.com/compute/docs/shutdownscript#limitations states:

- Compute Engine executes shutdown scripts only on a best-effort basis. In rare cases, Compute Engine cannot guarantee that the shutdown script will complete.

I spent many hours trying to make sure the shutdown script runs to completion, because a correctly defined shutdown script was either not invoked at all or, if it was invoked, was killed after a few moments.

The configuration used is:

  • VM provisioning model: Spot
  • E2 medium
  • Ubuntu 20.04 LTS and Ubuntu 20.04 LTS Minimal (provided by GCE VM wizard)
  • Scripts in Metadata: startup-script, shutdown-script

For more details, see the end of https://faun.pub/minecraft-server-on-digitalocean-with-vpn-140730681e3a (section Google Cloud) and the output of the CREATE SIMILAR wizard (SSH info omitted):

gcloud compute instances create minecraft-1 --project=basic-computing-360703 --zone=europe-central2-a --machine-type=e2-medium --network-interface=network-tier=PREMIUM,subnet=minecraft-i --metadata=^,@^shutdown-script=/home/kodcsakany/docker-minecraft-server/spot_mc.sh\ stop,@startup-script=/home/kodcsakany/docker-minecraft-server/spot_mc.sh\ start --no-restart-on-failure --maintenance-policy=TERMINATE --preemptible --provisioning-model=SPOT --instance-termination-action=STOP --service-account=843003590676-compute@developer.gserviceaccount.com --scopes=https://www.googleapis.com/auth/devstorage.read_only,https://www.googleapis.com/auth/logging.write,https://www.googleapis.com/auth/monitoring.write,https://www.googleapis.com/auth/servicecontrol,https://www.googleapis.com/auth/service.management.readonly,https://www.googleapis.com/auth/trace.append --create-disk=auto-delete=yes,boot=yes,device-name=minecraft,image=projects/ubuntu-os-cloud/global/images/ubuntu-minimal-2004-focal-v20220817,mode=rw,size=50,type=projects/basic-computing-360703/zones/europe-central2-a/diskTypes/pd-standard --no-shielded-secure-boot --shielded-vtpm --shielded-integrity-monitoring --reservation-affinity=any
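
Since the command above sets both startup-script and shutdown-script, it can be worth confirming from inside the VM that both keys really ended up in the instance metadata. The following is only a minimal sketch: the function name check_script_keys is hypothetical, and it assumes the metadata server's attribute listing returns one key per line.

```shell
#!/bin/sh
# Sketch: check that both script keys are present in the instance metadata.
# Reads attribute keys (one per line) on stdin and reports any that are missing.
# check_script_keys is a hypothetical helper name, not part of the guest agent.
check_script_keys() {
    keys=$(cat)
    missing=0
    for key in startup-script shutdown-script; do
        # -x: match the whole line, -F: treat the key as a fixed string
        if printf '%s\n' "$keys" | grep -qxF -- "$key"; then
            echo "$key: present"
        else
            echo "$key: MISSING"
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the VM:
#   curl -s -H 'Metadata-Flavor: Google' \
#     'http://metadata.google.internal/computeMetadata/v1/instance/attributes/' \
#     | check_script_keys
```

The function returns a non-zero status when either key is missing, so it can also be used in a conditional.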

So, after forking the source code of https://github.com/GoogleCloudPlatform/guest-agent/tree/main/google_metadata_script_runner and adding debug messages, I found that the script runner works well. The root cause lies in the systemd services.

I suggest adding a new section with troubleshooting tips for shutdown script invocation:


Introduction:

  • The startup and shutdown scripts are invoked by systemd services provided by Google.
  • Google cannot guarantee that systemd will call the stop action of the shutdown script service within the 30 s shutdown window, or that the script will finish in time. This depends on other systemd services, not on Google.

Design rules:

  • If the startup or shutdown script was not invoked, check again that the Metadata contains both of them; because of a bug, it is not possible to save the startup and shutdown scripts together.
  • Enable only the systemd services that are really needed, because some of them can block the stopping of other services, for example unattended-upgrades.service. If the blocking stop operation takes a long time, either decrease TimeoutStopSec with systemctl edit unattended-upgrades.service or disable shutdown-time upgrades in /etc/apt/apt.conf.d/50unattended-upgrades.
  • The Minimal images have fewer systemd services. Each service stop consumes part of the short shutdown window (30 s).
  • Do not write extra shutdown script logs to the /tmp directory, because it is cleared on the next boot.
  • If a service must keep running until the shutdown script has finished its stop operation, add a dependency on it to the shutdown script service. For example, for docker.service, run sudo systemctl edit google-shutdown-scripts.service and add:
[Unit]
Requires=docker.service
After=docker.service
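
Because systemctl edit opens an interactive editor, the overrides described above may be easier to apply from a provisioning script by writing the drop-in file directly. The sketch below assumes the usual location that systemctl edit uses (/etc/systemd/system/UNIT.d/override.conf). The function name write_dropin, the optional root prefix, and the 10s timeout in the usage comment are illustrative assumptions, not recommended values.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in without the interactive editor.
# $1 = unit name, $2 = drop-in body, $3 = optional root prefix (for dry runs).
# write_dropin is a hypothetical helper name.
write_dropin() {
    unit=$1
    body=$2
    root=${3:-}
    dir="$root/etc/systemd/system/$unit.d"
    mkdir -p "$dir" || return 1
    printf '%s\n' "$body" > "$dir/override.conf" || return 1
    # Print the path that was written so the caller can inspect it.
    echo "$dir/override.conf"
}

# Usage (as root), followed by a reload so systemd picks up the changes:
#   write_dropin google-shutdown-scripts.service \
#     "$(printf '[Unit]\nRequires=docker.service\nAfter=docker.service')"
#   write_dropin unattended-upgrades.service \
#     "$(printf '[Service]\nTimeoutStopSec=10s')"   # 10s is illustrative only
#   systemctl daemon-reload
```

Passing a temporary directory as the third argument lets the result be reviewed before anything under /etc is touched.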

Troubleshooting tips:

  • If the shutdown script was not invoked, get the last few minutes of logs with sudo journalctl --since '10 min ago' and find the shutdown section between System is powering down. and -- Reboot --. Identify the messages of Google Compute Engine Shutdown Scripts and google_metadata_script_runner, then check how long after the powering down message the script was invoked and how much time passed between its Finished message and the -- Reboot -- marker.
  • The logs of the shutdown script can be printed with the sudo journalctl -u google-shutdown-scripts command.
  • The status of the shutdown script runner service can be printed with the systemctl status google-shutdown-scripts.service command.
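
Finding the shutdown section by eye in a long journal can be tedious, so the first tip could be approximated with a small filter. This is a rough sketch only: shutdown_section is a hypothetical name, and it assumes the journal output contains the System is powering down. message and a -- Reboot -- marker line exactly as quoted above, which may differ between systemd versions.

```shell
#!/bin/sh
# Sketch: print only the shutdown section of journal output read from stdin,
# from the "System is powering down." line through the next "-- Reboot --" marker.
# shutdown_section is a hypothetical helper name.
shutdown_section() {
    awk '
        /System is powering down\./ { printing = 1 }
        printing                    { print }
        /^-- Reboot --/             { printing = 0 }
    '
}

# Usage on the VM after it has been started again:
#   sudo journalctl --since '10 min ago' | shutdown_section
#   sudo journalctl --since '10 min ago' | shutdown_section \
#     | grep -E 'Shutdown Scripts|google_metadata_script_runner'
```

Comparing the timestamps of the first and last lines of the filtered output gives a rough idea of how much of the shutdown window was available to the script.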

Thank you for your feedback. For documentation updates, please navigate to the documentation you wish to update and use the button marked 'send feedback' at the bottom of the page. Via GitHub issues, we are only able to accept feedback related to the operation of the guest agent.