Question: Safest way to manually clean the build agent's work folder
andmos opened this issue · comments
Have you tried troubleshooting?
Agent Version and Platform
Mainly 1.95.3, but also 2.123.0
Windows Server 2012 / Windows Server 2016
VSTS Type and Version
On-Prem TFS
Tfs2015 Update3
What's not working?
This is more a question.
We run on-prem TFS 2015 with the old 1.95.3 agent and have a LOT of repositories.
We haven't found any way to do global retention settings on the build agents, so the disk fills up over time.
Until now we have recycled the build servers themselves now and then, but sometimes we need to do manual cleanup on the agents themselves. I'm just wondering what the safest way is to manually clean old builds out of the agents.
Is it OK to delete stuff under _work, or do folders like _tasks need to be handled with care?
We are also running a few agents in the old VSTS (now Azure DevOps, I guess) on version 2.123.0; can the same cleanup strategy be used here in a pinch?
@andmos here is what I would suggest:
- Rename the current _work to _work_bk (you can delete the entire _work folder, but back up the current work folder just in case).
- Queue the definitions that you really need to work on the agent; the agent should recreate the entire _work folder for you.
- If everything works fine, you can just delete the _work_bk folder created in step 1.
@TingluoHuang - you want to make sure it's not in the middle of a build, right? Disable the agent and then check that it's not busy?
Thanks @bryanmacfarlane
@andmos let's add step 0. :D
- Make sure there is no running job on the agent, and stop the agent process/service before deleting or renaming any folder.
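The rename in step 1 can be sketched in a few lines of Python. This is only an illustration of the steps above, not anything shipped with the agent: the function name `rename_work_folder` and its parameters are made up for this example, and it assumes step 0 has already been done by hand (agent stopped, no job running).

```python
import os


def rename_work_folder(agent_root, work_name="_work", backup_name="_work_bk"):
    """Rename <agent_root>/_work to <agent_root>/_work_bk and return the backup path.

    Assumption: the agent process/service is already stopped and idle.
    This function does not check that for you.
    """
    work = os.path.join(agent_root, work_name)
    backup = os.path.join(agent_root, backup_name)
    if not os.path.isdir(work):
        raise FileNotFoundError(f"no work folder at {work}")
    if os.path.exists(backup):
        # Refuse to clobber an earlier backup that was never cleaned up.
        raise FileExistsError(f"backup already exists at {backup}")
    # The whole folder moves as one unit; nothing inside it is touched.
    os.rename(work, backup)
    return backup
```

Usage would look like `rename_work_folder(r"C:\agent")`, followed by restarting the agent and queuing the builds you care about so that it recreates _work.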
Cool, @TingluoHuang and @bryanmacfarlane! What we have tried beforehand is this cleanup script:
Get-ChildItem -recurse -directory 'c:\agent\*work' | where { (get-date) - $_.lastwritetime -gt 10. } | remove-item -recurse -Force
So we remove all files older than 10 days. On the VSTS/AzureDevOps agent we get this error:
"Could not find a part of the path 'C:\VSTSAgent\_work\_tasks\PublishTestResults_0b0f01ed-7dde-43ff-9cbb-e48954daf9b1\2.0.13\task.json'."
I guess that is because the folder itself was not deleted, only single files within?
Yes, I would not suggest changing or deleting files within your work folder; the agent can't handle that.
You can delete the entire work folder, but never try to delete just part of it. :)
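To make the "all or nothing" rule concrete, here is a small Python sketch that only ever deletes a whole work folder (or its backup) and refuses anything else, such as a path inside _tasks. The function name `delete_whole_folder` and the `allowed_names` guard are assumptions made for this illustration; as before, it assumes the agent is stopped.

```python
import os
import shutil


def delete_whole_folder(path, allowed_names=("_work", "_work_bk")):
    """Delete an agent work folder (or its backup) as a single unit.

    Returns True if something was deleted, False if the folder was absent.
    """
    name = os.path.basename(os.path.normpath(path))
    if name not in allowed_names:
        # Guard against pointing this at a subfolder such as _work/_tasks,
        # which is the kind of partial cleanup the agent cannot handle.
        raise ValueError(f"refusing to delete {path!r}: not a whole work folder")
    if os.path.isdir(path):
        shutil.rmtree(path)
        return True
    return False
```

On Windows, `shutil.rmtree` can fail on read-only files, so a real script may need extra error handling there.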
You might be interested in the following: http://www.codewrecks.com/blog/index.php/2017/05/06/maintenance-for-build-agent-in-tfs-build/
@bryanmacfarlane don't know if you guys want to keep this issue up to update some documentation on the subject? I guess I'm not the only one who might be interested in the topic 😃
I am going to close this issue for now. :)
Anybody who might land here again :D :D.
Instead of manually cleaning the "_work" folder, "maintenance" can be scheduled for the Agent/Pool.
When maintenance ran, the following was observed (depending on the maintenance configuration):
- Unused build directories - Deleted.
- GIT Repository - git repack -adfl & git prune -v commands were shown in logs.
- Unused release directories - Deleted.
P.S. : Follow at your own risk :) ;)
@rishibamba How do you schedule maintenance for the Agent/Pool? I don't see any option in the UI for that ...
@jklemmack
For Me: Organization Settings -> Agent Pools -> Select Your Agent Pool -> Settings
FYI, this is what the Maintenance job UI looks like
Looks like this is only available in tfs/azure devops server.
The information provided above was given on checking "Azure DevOps Services". Still available at the path mentioned.
Thanks :)
Not available for Azure Devops Server 2019
Note that there's a difference in available options between the Project Settings and Collection Settings screens. This was available to me when going through Collection Settings but not Project Settings.
Running pre-cleanup jobs (workspace.clean) or a post-cleanup step X to clean up the agent workdir relies heavily on the author of each pipeline to fix a potentially hard-to-detect "workdir pollution" issue, so we decided to re-implement the idea in a way that guarantees workdir pollution can never happen.
I have created https://github.com/EugenMayer/azure-agent-self-hosted-toolkit, which fixes this problem at the agent level, without requiring any changes to the pipelines or relying on them.
To quote the project idea:
The run-once mode is based on Microsoft's `./run.sh --once`, which ensures that an agent runs only 1 job and then stops.
This is used to
- clean up the workdir in a safe manner after each job
- ensure each job on an agent runs in a clean workdir
- start the agent right after cleaning up (a few seconds) to be available for the next job
This repository also offers a toolkit to start / setup x-agents and maintain them using the original tools of Microsoft, but wrapped in convenient scripts.
If this helps anybody else, happy to share it.
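For readers who just want the gist of the run-once idea, here is a minimal Python sketch of such a loop. It is not the toolkit's actual implementation; the function name `run_once_loop` and its parameters are hypothetical, and the only thing taken from the thread is that `./run.sh --once` makes the agent exit after a single job.

```python
import shutil
import subprocess


def run_once_loop(agent_cmd, work_dir, max_jobs):
    """Run agent_cmd, wipe work_dir once it exits, and repeat.

    Returns the list of exit codes, one per run.
    """
    exit_codes = []
    for _ in range(max_jobs):
        # With the real agent this would be something like
        # ["./run.sh", "--once"], which returns after one job.
        result = subprocess.run(agent_cmd)
        exit_codes.append(result.returncode)
        # The agent process has exited, so nothing is using the folder:
        # remove it as a whole and let the next run recreate it.
        shutil.rmtree(work_dir, ignore_errors=True)
    return exit_codes
```

Because the cleanup happens only between agent runs, it never deletes files under a job in progress, which is the same constraint as the manual procedure earlier in the thread.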