microsoft / ga4gh-tes

C# implementation of the GA4GH TES API; provides distributed batch task execution on Microsoft Azure


Azure Batch start-up script optimization

jlester-msft opened this issue · comments

Describe the bug
The current start-up script optionally installs jq. The install can take anywhere from 10-15 seconds, and since jq is only sometimes installed we can't rely on it always being available.

The current bash script is essentially:

# /usr/bin/bash -c 'trap "echo Error trapped; exit 0" ERR; sudo touch tmp2.json &&\
(sudo cp /etc/docker/daemon.json tmp1.json || sudo echo {} >tmp1.json) &&
    sudo chmod a+w tmp?.json &&
if fgrep "$(dirname "$(dirname "$AZ_BATCH_NODE_ROOT_DIR")")/docker" tmp1.json; 
    then echo grep "found docker path"; 
elif [ $? -eq 1 ]; 
    then sudo apt-get install -y jq &&
        jq \.\[\"data-root\"\]=\""$(dirname "$(dirname "$AZ_BATCH_NODE_ROOT_DIR")")/docker"\" tmp1.json >>tmp2.json &&
        sudo cp tmp2.json /etc/docker/daemon.json &&
        sudo chmod 644 /etc/docker/daemon.json &&
        sudo systemctl restart docker &&
        echo "updated docker data-root";
else (echo "grep failed" || exit 1); 
fi

The change is to replace the jq command:

jq \.\[\"data-root\"\]=\""$(dirname "$(dirname "$AZ_BATCH_NODE_ROOT_DIR")")/docker"\" tmp1.json >>tmp2.json &&

With the following single line of python3:

python3 -c "import json,os;data=json.load(open(\"tmp1.json\"));data[\"data-root\"]=os.path.join(os.path.dirname(os.path.dirname(os.getenv(\"AZ_BATCH_NODE_ROOT_DIR\"))), \"docker\");json.dump(data, open(\"tmp2.json\", \"w\"), indent=2);"

In a more human-readable form, this python3 script does the same thing as the jq command:

import json,os

# Load the JSON data from tmp1.json
data=json.load(open("tmp1.json"))
data["data-root"]=os.path.join(os.path.dirname(os.path.dirname(os.getenv("AZ_BATCH_NODE_ROOT_DIR"))), "docker")
# Write out the data struct to tmp2.json, with pretty-printing
json.dump(data, open("tmp2.json", "w"), indent=2)
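For anyone who wants to verify the replacement locally before changing the node start-up, here is a self-contained sanity check of the same logic. The `AZ_BATCH_NODE_ROOT_DIR` value and the starting daemon.json contents below are made-up stand-ins, not what a real Batch node would report:

```python
import json
import os
import tempfile

# Stand-in value; on a real Batch node Azure sets AZ_BATCH_NODE_ROOT_DIR itself.
os.environ["AZ_BATCH_NODE_ROOT_DIR"] = "/mnt/batch/tasks"

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "tmp1.json")
    dst = os.path.join(tmp, "tmp2.json")

    # Simulate an existing daemon.json with an unrelated setting.
    with open(src, "w") as f:
        json.dump({"log-driver": "json-file"}, f)

    # Same logic as the one-liner above: add/overwrite "data-root",
    # preserving any settings already present.
    data = json.load(open(src))
    data["data-root"] = os.path.join(
        os.path.dirname(os.path.dirname(os.getenv("AZ_BATCH_NODE_ROOT_DIR"))),
        "docker")
    json.dump(data, open(dst, "w"), indent=2)

    result = json.load(open(dst))

print(result)  # {'log-driver': 'json-file', 'data-root': '/mnt/docker'}
```

Note that, unlike a naive string edit, round-tripping through `json.load`/`json.dump` keeps existing daemon.json keys intact, which is the same behavior the jq expression has.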

This takes the start-up script runtime down to <1s instead of 10-15s. Python3 is available on all Ubuntu 20.04+ versions and on CentOS 7.7+, and the script uses only the python3 standard library.

Suggested changes
Change the start-up script to this one-liner:

/usr/bin/bash -c 'trap "echo Error trapped; exit 0" ERR; sudo touch tmp2.json && (sudo cp /etc/docker/daemon.json tmp1.json || sudo echo {} > tmp1.json) && sudo chmod a+w tmp?.json && if fgrep "$(dirname "$(dirname "$AZ_BATCH_NODE_ROOT_DIR")")/docker" tmp1.json; then echo grep "found docker path"; elif [ $? -eq 1 ]; then python3 -c "import json,os;data=json.load(open(\"tmp1.json\"));data[\"data-root\"]=os.path.join(os.path.dirname(os.path.dirname(os.getenv(\"AZ_BATCH_NODE_ROOT_DIR\"))), \"docker\");json.dump(data, open(\"tmp2.json\", \"w\"), indent=2);" && sudo cp tmp2.json /etc/docker/daemon.json && sudo chmod 644 /etc/docker/daemon.json && sudo systemctl restart docker && echo "updated docker data-root"; else (echo "grep failed" || exit 1); fi'