nix-community / buildbot-nix

A nixos module to make buildbot a proper Nix-CI [maintainer=@Mic92]

Better error handling on boot

zimbatm opened this issue · comments

If GitHub is unavailable on boot, buildbot will fail ungracefully right now.

Currently the only calls to GitHub/Gitea we make are during the creation of webhooks. Those can be postponed into the respective reloaders. It makes more sense to have them there anyway, in my opinion.
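To illustrate the idea of moving webhook creation out of startup and into a reloader, here is a minimal sketch. It is not buildbot-nix's actual code: `ProjectReloader`, `ForgeUnavailable`, and the `create_webhook` callable are hypothetical names made up for this example, and the real reloaders may be structured quite differently.

```python
import logging
from typing import Callable, List

log = logging.getLogger(__name__)


class ForgeUnavailable(Exception):
    """Hypothetical error raised when the forge (GitHub/Gitea) cannot be reached."""


class ProjectReloader:
    """Hypothetical reloader: all network calls live in reload(), none in __init__."""

    def __init__(self, repos: List[str], create_webhook: Callable[[str], None]) -> None:
        # Only store configuration here. Nothing in the constructor touches
        # the network, so startup cannot fail because the forge is down.
        self.create_webhook = create_webhook
        self.pending = list(repos)

    def reload(self) -> List[str]:
        """Try to create the missing webhooks; return the repos that still need one."""
        still_pending = []
        for repo in self.pending:
            try:
                self.create_webhook(repo)
            except ForgeUnavailable as exc:
                # Log and keep the repo queued instead of crashing the master.
                log.warning("webhook creation for %s failed, will retry: %s", repo, exc)
                still_pending.append(repo)
        self.pending = still_pending
        return still_pending
```

With this shape, constructing the reloader while the forge is down is harmless, and each later `reload()` call retries only the repositories whose webhook is still missing.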

commented

I have also noticed oopses when the user is not an admin, and I have to restart the master to get past them.

Might not matter if admin is no longer needed with future changes, but better error handling here would be an improvement too.

@mannp can you elaborate a bit? I'm not sure what bug you're hitting.

commented

@mannp can you elaborate a bit? I'm not sure what bug you're hitting.

I use kanidm with forgejo OIDC and, at the time, had not yet sorted out assigning an admin user via OIDC groups.

That meant that I manually forced the buildbot-nix user as an admin locally, but when I logged into the buildbot oidc to cancel builds or the like, the forgejo buildbot user would be set back to a normal user.

Anyway, that meant that if I restarted the buildbot services while that user was a non-admin, the services would bork on seeing that the user was not an admin, rather than handling it gracefully.

I have since resolved the kanidm/forgejo oidc thing so don't come across the issue anymore.

oooooh okay I think I get it.

Back to what this GitHub issue is actually about. If I remember right (and I think I do), in #156 I made the refresh of projects (the part that does network requests) run only in the reload builder, so the network or GitHub being down should not block buildbot startup. The same has applied to Gitea from the start. (I did check the code to verify this.) I'll test whether it indeed doesn't fail and close if that's alright with everyone.

Okay, so testing it out: it does fail. The refresh is postponed, but the creation of the hook is not :/

Was this fixed in this pull request? #203

Depends on how you look at it, there is nothing that can error at boot now so technically, yes :)

I think this is solved: buildbot no longer does anything at startup in the main thread that could error out, so we don't need any error handling on boot.