spotify / helios

Docker container orchestration platform


OldJobReaper should not garbage-collect jobs in use by empty DeploymentGroups

mattnworb opened this issue · comments

Currently the OldJobReaper will remove any helios job that is not deployed, meaning that there is no host that the master is aware of that should be running the job.

Internally at Spotify, when we deploy a new revision of a service/container to Helios, we usually create the Job and a DeploymentGroup, and then run a rolling-update of that group on all of our helios clusters, even clusters where the deployment group currently matches no hosts. The intention is to let service/container owners easily add hosts/capacity in a site that currently has no hosts matching that group/selector, and have Helios automatically deploy the same job there that is already running in other sites.

The OldJobReaper breaks this workflow, though: by the time the service/container owner provisions hosts for the first time in a site with an empty DeploymentGroup, the job associated with that group has sometimes already been deleted, so the job is not automatically deployed. This leads to an inconsistent experience for service/container owners: sometimes when they add capacity in a new site the Helios job is automatically deployed, and sometimes it is not.

I propose that OldJobReaper, when deciding whether to reap a job, should also check that the job is not currently associated with any DeploymentGroup, preventing the inconsistent behavior described above.

One possible implementation: at the start of each iteration, OldJobReaper fetches all existing DeploymentGroups; then, while iterating over the jobs to decide whether to reap each one, it checks whether any DeploymentGroup references that job in its `deploymentGroup.jobId` field. Fetching the set of DeploymentGroups once per iteration avoids an O(n²) nested loop like `for job in model.getJobs(): for group in model.getDeploymentGroups(): ...`.
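A minimal sketch of that check, assuming hypothetical `Job` and `DeploymentGroup` shapes (the class and method names here are illustrative, not the actual Helios internals):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class OldJobReaperSketch {

    // Simplified stand-ins for the real Helios model classes.
    record DeploymentGroup(String name, String jobId) {}
    record Job(String id) {}

    /**
     * Returns the subset of undeployed jobs that are safe to reap, i.e.
     * jobs not referenced by any DeploymentGroup's jobId field.
     */
    static List<Job> reapable(List<Job> undeployedJobs,
                              List<DeploymentGroup> groups) {
        // Build the set of in-use job IDs once, up front, so the
        // per-job check is O(1) instead of scanning all groups
        // for every job (the O(n^2) shape mentioned above).
        final Set<String> jobIdsInUse = new HashSet<>();
        for (DeploymentGroup group : groups) {
            if (group.jobId() != null) {
                jobIdsInUse.add(group.jobId());
            }
        }
        return undeployedJobs.stream()
                .filter(job -> !jobIdsInUse.contains(job.id()))
                .toList();
    }
}
```

With this shape, a job referenced by an empty DeploymentGroup survives reaping, so it is still available to deploy automatically when hosts are later added to that group.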

commented

I think this is probably fine, and a worthwhile improvement for our users. It's sensible not to remove objects when other objects depend on them.

I don't think we have an OldDeploymentGroupReaper though, so the next natural question is whether we'd want to introduce something like that. My hunch is that the number and on-disk cost of these is very small, to the point where we don't really need to worry about it.

I agree there isn't much point in GC-ing old deployment groups. I would add though that doing so would also work against the intent outlined above, where an empty deployment group is created in a helios cluster intentionally and meant to exist in case the service owner ever wants to create hosts that would belong to that group in that site in the future.