EugenMayer / azure-agent-self-hosted-toolkit

Toolkit to run azure agents under linux

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

WAT

Helps running azure (multiple) azure self-hosted agents on one machine (scoped).

Features

  • One-line command to set up n-agents (including indivdual os users per agent, systemd service). Downloads, setups and keeps them running.
  • Set's up a user and home folder for each agent automatically.
  • Supports running docker based jobs.
  • Uses --once mode to run the agents with automatic workdir cleanup and restart after job-end
  • offers batch-uninstalling agents
  • Helps setting up MTU vars

This is not

  • an orchestrator for k8s
  • this does not use docker-in-docker (though it can be used to run jobs in docker containers)

Details

run-once mode

The run-once mode is based on Microsoft's ./run.sh --once which ensures that an agents only runs 1 job and then stops. This is used to

  • cleanup the workdir in a safe manner after each job (ensuring no other job is scheduled yet)
  • ensures each job on an agent runs in a clean workdir
  • starts the agent right after cleanup up (few seconds) to be available for the next job
  • Use the original microsoft tools, binaries and all the bits. Be agent-upgrade ready.

This fixes issues like

Setup / Usage

Requirements

  • curl

One agents

For examples, sets up and starts one agent called agent0 for the pool Default enabling the run-once mode

./agent-setup agent0 <PAT> Default 1

You can also add an additional agent, with an MTU (for the docker network) of 1400

./agent-setup agent1 <PAT> Default 1 1400

Many agents

X agents, creates 15 agents (agent0....agent14) in the pool Default with an MTU of 1400

./batch-setup.sh 0 15 <PAT> Default 1 1400

Uninstall

The below commands does uninstall

  • uninstall the systemd service
  • de-register the agent from the pool
  • remove the agent install dir
  • remove the agent user home folder
  • remove the agent user

So there should be nothing left after running this for one agent.

One agents

./agent-uninstall agent0 <PAT>

Many agents

Uninstalls agent 0..14 from the Default pool

./batch-uninstall.sh 0 15 <PAT> Default 

Troubleshooting

100s delay before starting a job

Bug in the agent, see microsoft/azure-pipelines-agent#4215 workaround by blocking the network request

sudo ip route add blackhole 169.254.169.254

This sends the stalling requests into a direct-drop gw and speeds up the start by 100s.

Contributions

Anytime, just open PRs. Happy to extend whatever we have here.

About

Toolkit to run azure agents under linux


Languages

Language:Shell 100.0%