spack / spackbot

Spack maintainer bot 🤖

Home Page:https://spack.github.io/spackbot/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Spackbot is failing to trigger pipelines

scottwittenburg opened this issue · comments

After adding some debug prints in #51, and waiting a couple hours for the new version to get picked up by kube, spackbot is no longer fulfilling request to run pipelines, opting instead to say:

INFO:spackbot.handlers.pipelines:Author <whoever> is requesting a pipeline run.
ERROR:aiohttp.server:Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/aiohttp/web_protocol.py", line 422, in _handle_request
    resp = await self._request_handler(request)
  File "/usr/local/lib/python3.7/site-packages/aiohttp/web_app.py", line 499, in _handle
    resp = await handler(request)
  File "/app/spackbot/__main__.py", line 50, in main
    await router.dispatch(event, gh, session=session)
  File "/app/spackbot/routes.py", line 37, in dispatch
    await callback(event, *args, **kwargs)
  File "/app/spackbot/routes.py", line 107, in add_comments
    message = await handlers.run_pipeline(event, gh)
  File "/app/spackbot/handlers/pipelines.py", line 57, in run_pipeline
    async with aiohttp.ClientSession() as session:
AttributeError: module 'gidgethub.aiohttp' has no attribute 'ClientSession'

Earlier today after #51 was merged, I noticed that the spackbot pod was 43 days old:

$ kubectl get pods -n spack
NAME                                        READY   STATUS    RESTARTS   AGE
...
spackbot-spack-io-7b5cc4bdf4-l5whj          1/1     Running   0          43d
spackbot-spack-io-7b5cc4bdf4-pgspr          1/1     Running   0          43d
...

Does that mean that our k8s deployment might have last picked up a new version of the docker image 43 days ago as well?

The reason I mention it is because the change in #51 doesn't seem it could have possibly caused the error above. And knowing that the aiohttp module does have a ClientSession, I suspect the from gidgethub import aiohttp import in the pipelines.py module was intended to be just import aiohttp.

@scottwittenburg unfortunately right now spackbot is in a weird state and I have not had a chance to get back to it. Basically, we merged #22, and there was a secret added that we don't currently handle correctly in k8s. It was only in @vsoch's docker compose file (the RSA key). We have a test deployment at spackbot-dev.spack.io to debug the RSA issue (see the add/spackbot-test branch in spack-infrastructure, and also see #46 -- that's what we need to get working) but I haven't had a chance to go back and do this stuff with end of the year business.

So, short story: tip of main doesn't work and I deployed an older commit, which I thought we were still using. That error message indicates to me that there might be a newer revision deployed accidentally.

To me it seems that #37 could have introduced this issue, is it possible kube hasn't picked up the new spackbot image since then? I guess we are missing at least one test case too, which might have caught this. 😅

PR incoming...

I think we fixed this already? It's just that it was never deployed, the container URI needs to be updated.

We never fixed the bug; we just redeployed the old version.

It was, just never reviewed I guess :( > #46

I would not let a bug like that remain.

We never finished testing that PR, as we didn't get around to testing the RSA copy part with spackbot dev. So the old deployed version remained.

Yeah understood. Although adding spack style was merged into main already, so if you rebuild a container now it's going to erroneously provide that?

That PR actually got most of the underlying stuff working - the issue was that running spack style just froze and never went beyond that. I don't know what changed to cause that. But I did get that PR working for all respect to the idrsa stuff.

@scottwittenburg is going to build a container based on e340565 since we haven't really tested anything beyond that.

And by "haven't really tested" I mean "haven't gotten all the bugs out of the PR that has all the fixes"