microsoft / azure-pipelines-agent

Azure Pipelines Agent 🚀

Question: Safest way to manually clean the build agent's work folder

andmos opened this issue

Have you tried trouble shooting?

Trouble shooting doc

Agent Version and Platform

Mainly 1.95.3
but also 2.123.0

Windows Server 2012 / Windows Server 2016

VSTS Type and Version

On-Prem TFS
Tfs2015 Update3

What's not working?

This is more of a question.
We run on-prem TFS 2015 with the old 1.95.3 agent and have a LOT of repositories.
We haven't found any way to set global retention on the build agents, so the disk fills up over time.
Until now we have recycled the build servers themselves now and then, but sometimes we need to do manual cleanup on the agents. I'm just wondering: what is the safest way to manually clean old builds out of the agents?
Is it OK to delete stuff under _work, or do folders like _tasks need to be handled with care?

We are also running a few agents against the old VSTS (now Azure DevOps, I guess) on version 2.123.0. Can the same cleanup strategy be used there in a pinch?

@andmos here is what I would suggest:

  1. Rename the current _work folder to _work_bk. (You could delete the entire _work folder outright, but keep a backup just in case.)
  2. Queue the definitions you really need on that agent; the agent should recreate the entire _work folder for you.
  3. If everything works fine, delete the _work_bk folder created in step 1.

@TingluoHuang - you want to make sure it's not in the middle of a build, right? Disable the agent and then check that it's not busy?

Thanks @bryanmacfarlane
@andmos let's add step 0. :D

  0. Make sure there is no running job on the agent, and stop the agent process/service before deleting or renaming any folder.
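
Steps 0 through 3 from the comments above can be sketched as a small script. This is only a minimal sketch, not an official tool: the helper names `backup_work_folder` and `delete_backup` are hypothetical, and stopping the agent (step 0) is deliberately left out because it depends on how the agent was installed.

```python
import shutil
import tempfile
from pathlib import Path


def backup_work_folder(agent_root: Path) -> Path:
    """Step 1: rename _work to _work_bk as a whole, never touching its contents.

    Assumes the agent is already idle and stopped (step 0).
    """
    work = agent_root / "_work"
    backup = agent_root / "_work_bk"
    if backup.exists():
        raise FileExistsError(f"{backup} already exists; remove it first")
    if not work.is_dir():
        raise FileNotFoundError(f"{work} not found")
    work.rename(backup)
    return backup


def delete_backup(agent_root: Path) -> None:
    """Step 3: once the queued builds succeed, delete the whole backup folder."""
    shutil.rmtree(agent_root / "_work_bk")


if __name__ == "__main__":
    # Demo on a throwaway directory standing in for the agent root.
    root = Path(tempfile.mkdtemp())
    (root / "_work" / "_tasks").mkdir(parents=True)
    backup_work_folder(root)
    # Step 2 happens here: restart the agent and queue the builds you need;
    # per the advice above, the agent recreates _work on its own.
    delete_backup(root)
    print(sorted(p.name for p in root.iterdir()))
```

On Windows, read-only files inside the backup (Git object files often are) can make `shutil.rmtree` fail; clearing the read-only attribute first is a common workaround.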

Cool, @TingluoHuang and @bryanmacfarlane! What we have tried beforehand is this cleanup script:

Get-ChildItem -recurse -directory 'c:\agent\*work' | where { (get-date) - $_.lastwritetime -gt 10. } | remove-item -recurse -Force

So we remove all files older than 10 days. On the VSTS/Azure DevOps agent we get this error:
"Could not find a part of the path 'C:\VSTSAgent\_work\_tasks\PublishTestResults_0b0f01ed-7dde-43ff-9cbb-e48954daf9b1\2.0.13\task.json'."

I guess that is because the folder itself was not deleted, only single files within?

Yes, I would not suggest changing or deleting files within your work folder; the agent can't handle that.
You can delete the entire work folder, but never try to delete just part of it. :)
You might be interested in the following: http://www.codewrecks.com/blog/index.php/2017/05/06/maintenance-for-build-agent-in-tfs-build/

@bryanmacfarlane don't know if you guys want to keep this issue up to update some documentation on the subject? I guess I'm not the only one who might be interested in the topic 😃

I am going to close this issue for now. :)

For anybody who might land here again :D :D.

Instead of manually cleaning the "_work" folder, "maintenance" can be scheduled for the Agent/Pool.

When maintenance ran, the following was observed (depending on the maintenance configuration):

  1. Unused build directories - Deleted.
  2. Git repositories - the commands git repack -adfl and git prune -v appeared in the logs.
  3. Unused release directories - Deleted.

P.S. : Follow at your own risk :) ;)

@rishibamba How do you schedule maintenance for the Agent/Pool? I don't see any option in the UI for that ...

@jklemmack
For Me: Organization Settings -> Agent Pools -> Select Your Agent Pool -> Settings

FYI, this is what the Maintenance job UI looks like

[image: Maintenance job settings]

Looks like this is only available in TFS/Azure DevOps Server.

The information above was based on checking "Azure DevOps Services". It is still available at the path mentioned.
Thanks :)

Not available for Azure DevOps Server 2019.

Note that there's a difference in the available options between the Project Settings and Collection Settings screens. This was available to me when going through Collection Settings but not Project Settings.

Running a pre-job cleanup (workspace.clean) or some post-job cleanup step relies heavily on the author of each pipeline to fix a "workdir pollution" issue that can be hard to detect, so we decided to re-implement the idea in a way that guarantees workdir pollution can never happen.

I have created https://github.com/EugenMayer/azure-agent-self-hosted-toolkit, which fixes this problem at the agent level, without requiring any changes to the pipelines or relying on them.

To quote the project's idea:

The run-once mode is based on Microsoft's `./run.sh --once`, which ensures that an agent runs only one job and then stops.
This is used to

 - clean up the workdir in a safe manner after each job
 - ensure each job on an agent runs in a clean workdir
 - start the agent right after cleaning up (a few seconds) so it is available for the next job

This repository also offers a toolkit to start / setup x-agents and maintain them using the original tools of Microsoft, but wrapped in convenient scripts.
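
The run-once idea quoted above can be illustrated with a short loop. This is a minimal sketch under stated assumptions, not the toolkit's actual implementation: the function name `run_once_loop`, the `max_jobs` limit, and the stand-in command used in the demo are all hypothetical. On a real Linux agent the command would be `["./run.sh", "--once"]`, run from the agent's install directory.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_once_loop(agent_root: Path, run_cmd: list[str], max_jobs: int) -> int:
    """Run the agent one job at a time, wiping _work after each job.

    Returns the number of completed cycles.
    """
    work = agent_root / "_work"
    cycles = 0
    for _ in range(max_jobs):
        # Blocks until the command exits; with --once the agent is expected
        # to exit after a single job.
        subprocess.run(run_cmd, cwd=agent_root, check=False)
        # The agent process has exited, so the whole work folder can be
        # removed in one go rather than piece by piece.
        shutil.rmtree(work, ignore_errors=True)
        cycles += 1
    return cycles


# Demo with a stand-in command that only creates a _work folder, so the
# sketch can run without a real agent installed.
demo_root = Path(tempfile.mkdtemp())
fake_agent = [sys.executable, "-c",
              "import os; os.makedirs('_work/1/s', exist_ok=True)"]
run_once_loop(demo_root, fake_agent, max_jobs=2)
print((demo_root / "_work").exists())
```

Because the cleanup happens only after the agent process has stopped, this follows the earlier advice in the thread: remove the work folder as a whole, and never while a job is running.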

If this helps anybody else, happy to share it.