argoproj / argo-workflows

Workflow Engine for Kubernetes

Home Page: https://argo-workflows.readthedocs.io/

`no artifact logs are available` when workflow is archived but still live

liudongqing opened this issue · comments

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when I tested with :latest
  • I have searched existing issues and could not find a match for this bug
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what did you expect to happen?

We just upgraded Argo Workflows from 3.4.4 to 3.5.5. We enabled the workflow archive:

persistence:
    connectionPool:
      maxIdleConns: 100
      maxOpenConns: 0
    # save the entire workflow into etcd and DB
    nodeStatusOffLoad: true
    # enable archiving of old workflows
    archive: true
    postgresql:

but didn't enable archive logs.

artifactRepository:
  # -- Archive the main container logs as an artifact
  archiveLogs: false

Before the upgrade, we could see the logs of a finished workflow (whether it succeeded or failed) in the UI (the server gets the logs from the pod, I guess). After the upgrade, the UI complains "no artifact logs are available" and no logs are returned.

Is this the expected result? Or is there a configuration item that controls this behavior?

Version

v3.5.5

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

any workflows

Logs from the workflow controller

kubectl logs -n argo deploy/workflow-controller | grep ${workflow}

Logs from your workflow's wait container

kubectl logs -n argo -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded

> from UI(the server gets the log from pod I guess)

Correct, it retrieves Pod logs.

> but after upgrade, the UI will complain " no artifact logs are available " and no logs returned.

I'm not sure this is related to the upgrade. Did you change your configuration before or after it?

An Archived Workflow is typically a deleted Workflow, therefore there are no Pods for it to retrieve logs from. So if you want logs for deleted Pods, you can either link to a log provider or use artifact logs. You don't have artifact logs, so the error message certainly sounds correct.
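As a rough illustration of the "link to a log provider" option, a `links` entry in the workflow-controller ConfigMap might look like the sketch below. The link name and the `logs.example.com` URL are placeholders, not values from this issue; point the URL at whatever logging system you actually run.

```yaml
# Sketch only: goes under the workflow-controller ConfigMap data.
# The name and URL are hypothetical placeholders.
links:
  - name: Pod Logs (external)
    # "pod-logs" scope shows the link in the UI's log viewer for a pod
    scope: pod-logs
    url: https://logs.example.com/search?namespace=${metadata.namespace}&podName=${metadata.name}&startedAt=${status.startedAt}&finishedAt=${status.finishedAt}
```

With a link like this, logs stay reachable through the external system after the Pods are gone, without storing them as artifacts.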

> An Archived Workflow is typically a deleted Workflow, therefore there are no Pods for it to retrieve logs from. So if you want logs for deleted Pods, you can either link to a log provider or use artifact logs. You don't have artifact logs, so the error message certainly sounds correct.

We didn't change any configuration during the upgrade; the only change was the image tag, from "v3.4.4" to "v3.5.5". The problem is that the workflow is archived as soon as it finishes, so we have no chance to check the logs even if it failed just 1 minute earlier. After enabling artifact logs, we can see the logs now.
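For anyone hitting the same message, the workaround described here amounts to flipping the flag shown earlier. This is a minimal sketch, assuming the same values layout as the snippet above; the `s3` bucket and endpoint are placeholders, since the issue does not say which artifact repository is in use.

```yaml
artifactRepository:
  # -- Archive the main container logs as an artifact
  archiveLogs: true
  # An artifact repository must be configured for the logs to be stored.
  # The values below are hypothetical placeholders.
  s3:
    bucket: my-bucket
    endpoint: s3.example.com
```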

Is it correct for a finished workflow to be archived immediately?

> Is it correct for a finished workflow to be archived immediately?

A Workflow is labeled for archiving when it completes, and archiving is kicked off when that label is detected.

That is generally independent of deletion, however, which is based on your TTL or retentionPolicy.
It sounds like you have a longer TTL potentially, and so you have Workflows that are simultaneously in the archive and still in the cluster? In that case, the pod logs should still be retrievable.
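To make the TTL point concrete, here is a minimal sketch of a Workflow that stays in the cluster for an hour after it completes. The one-hour value, the workflow name, and the `busybox` image are illustrative assumptions, not taken from this issue.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: ttl-example-   # hypothetical name
spec:
  entrypoint: main
  # Delete the Workflow 1 hour after completion (illustrative value).
  # Archiving happens on completion, independently of this deletion.
  ttlStrategy:
    secondsAfterCompletion: 3600
  templates:
    - name: main
      container:
        image: busybox
        command: [echo, hello]
```

During that hour the Workflow is both archived and live, which is the situation described in this issue.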

I think I see the issue here: it's probably not falling back to Pod logs properly in 3.5.

3.5 unified the Archived + Live UI into one page (#11121), so there is no longer a distinction between them in the UI. In particular, this line would previously only be triggered if you were navigating archived workflows specifically, but now it can be triggered on a live workflow that is also archived. The comment above that line is not quite correct in your case.