Azure / login

Connect to Azure

ERROR: AADSTS700024: Client assertion is not within its valid time range

krukowskid opened this issue · comments

Hi! I am facing a similar issue (#180) that appears to have been resolved, but I'm still encountering this problem when executing dotnet tests in GitHub Runner.

Azure.Identity.CredentialUnavailableException : DefaultAzureCredential failed to retrieve a token from the included credentials. See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/defaultazurecredential/troubleshoot
...
- Azure CLI authentication failed due to an unknown error. See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/azclicredential/troubleshoot ERROR: AADSTS700024: Client assertion is not within its valid time range. Current time: 2023-10-31T11:53:04.4424859Z, assertion valid from 2023-10-31T11:39:49.0000000Z, expiry time of assertion 2023-10-31T11:44:49.0000000Z. Review the documentation at https://docs.microsoft.com/azure/active-directory/develop/active-directory-certificate-credentials . Trace ID: d64c537e-1d94-4274-9012-c0d7590f1c00 Correlation ID: 5c769bb7-e85a-4557-ba28-92f8eca1c4ff Timestamp: 2023-10-31 11:53:04Z
	Interactive authentication is needed. Please run:
	az login

I'm using action version 1.4.6 and azure.identity package version 1.10.4 with DefaultAzureCredential(). The issue doesn't occur in integration tests, where nearly all tests use tokens. However, if I run API/UI tests where I use the identity in only one or two tests, it fails with the above error. Do you have any suggestions or workarounds?
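For readers trying to interpret the error text above, here is a minimal Python sketch that pulls the three timestamps out of an AADSTS700024 message and reports how long the assertion was valid and how late the request was. The function name `parse_assertion_window` is hypothetical, and the pattern only assumes the wording quoted in this issue; fractions of a second are dropped.

```python
import re
from datetime import datetime, timedelta

# One timestamp as it appears in the error text, e.g. 2023-10-31T11:53:04.4424859Z.
_TS = r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.\d+)?Z"
_PATTERN = re.compile(
    rf"Current time: {_TS}, assertion valid from {_TS}, expiry time of assertion {_TS}"
)


def parse_assertion_window(message: str) -> dict:
    """Extract the timestamps from an AADSTS700024 message (hypothetical helper)."""
    match = _PATTERN.search(message)
    if match is None:
        raise ValueError("not an AADSTS700024 time-range message")
    current, valid_from, expiry = (
        datetime.strptime(group, "%Y-%m-%dT%H:%M:%S") for group in match.groups()
    )
    return {
        "assertion_lifetime": expiry - valid_from,  # how long the assertion was valid
        "late_by": current - expiry,                # how long after expiry it was used
    }


message = (
    "ERROR: AADSTS700024: Client assertion is not within its valid time range. "
    "Current time: 2023-10-31T11:53:04.4424859Z, "
    "assertion valid from 2023-10-31T11:39:49.0000000Z, "
    "expiry time of assertion 2023-10-31T11:44:49.0000000Z."
)
window = parse_assertion_window(message)
print(window["assertion_lifetime"])  # 0:05:00
print(window["late_by"])             # 0:08:15
```

For the message in this report, the assertion was valid for five minutes and was presented a little over eight minutes after it expired.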

Hi @krukowskid , could you provide the workflow file, run it again with debug mode, and provide the debug log?

Same issue here, and it's a real pain. The tokens are only valid for 5 minutes, and if you don't use them until late in your workflow, it just throws the error shown by the OP.

I tried azure/login@1.5.0 and got the same issue. I'm not using any other way to log in to Azure.

Hi @benjamin-rousseau-shift could you provide your workflow file and debug log? Do you also use OIDC login? OIDC login with SP should have an expiration of 1 hour and OIDC with User-assigned managed identity should have 24 hours.

I will try to give you that. I am using OIDC with a service principal using federated credentials.

@YanaXu

Here is my workflow definition (it's a reusable workflow). I have also enabled debug mode, but it doesn't make sense to paste the log here because it's so noisy. The workflow fails in the "🧪 Run tests for specified filter and rerun failed" step. I will provide debug logs; just let me know which part or step you are interested in.

reusable workflow definition
name: 'reusable/run-tests'
on:
  workflow_call:
    inputs:
      environment:
        required: true
        type: string

      system-under-test:
        required: false
        type: string
        default: xwow

      test-configuration:
        required: true
        type: string

      tests-filter:
        description: 'Filter for selecting tests to run'
        required: true
        type: string

      tests-web-url:
        required: false
        type: string

      tests-apigateway-url:
        required: false
        type: string

      report-name:
        description: 'Name for execution report and attachments'
        required: false
        default: Default
        type: string

      allure-reports:
        required: false
        default: false
        type: boolean

      allure-project-id:
        required: false
        type: string

    secrets:
      KrukowskidBotAppId:
        required: false
      KrukowskidBotPrivateKey:
        required: false
      ad-username:
        required: false
      ad-password:
        required: false
      azure-client-id:
        required: false
      azure-tenant-id:
        required: false
      azure-subscription-id:
        required: false
      identity-url:
        required: false
      identity-client-id:
        required: false
      backoffice-identity-url:
        required: false
      backoffice-client-id:
        required: false
      backoffice-client-secret:
        required: false      
      backoffice-identity-scope:
        required: false
      allure-server-password:
        required: false

permissions:
  id-token: write
  contents: write
  actions: read
  checks: write

jobs:
  run-tests:
    name: run-tests
    environment: ${{ inputs.environment }}
    runs-on:
      labels: ubuntu-latest-8core32ram
    timeout-minutes: 20
    env:
      E2E-ENVIRONMENT: ${{ inputs.test-configuration }}
      E2E-SUT: ${{ inputs.system-under-test }}
      ALLURE_SERVER_URL: ${{ vars.ALLURE_SERVER_URL }}
      ALLURE_SERVER_USER: ${{ vars.ALLURE_SERVER_USER }}
      ALLURE_SERVER_PASSWORD: ${{ secrets.allure-server-password }}
    defaults:
      run:
        shell: pwsh
    steps:
    - name: Generate token
      if: ${{ github.repository != 'Krukowskid/Krukowskid.Tests' }}
      id: generate_token
      uses: tibdex/github-app-token@v1
      with:
        app_id: ${{ secrets.KrukowskidBotAppId }}
        private_key: ${{ secrets.KrukowskidBotPrivateKey }}

    - name: Checkout Tests
      if: ${{ github.repository != 'Krukowskid/Krukowskid.Tests' }}
      uses: actions/checkout@v3
      with:
        repository: Krukowskid/Krukowskid.Tests
        token: "${{ steps.generate_token.outputs.token }}"
        ref: main
        
    - name: Checkout Tests
      if: ${{ github.repository == 'Krukowskid/Krukowskid.Tests' }}
      uses: actions/checkout@v3
        
    - name: Azure login
      uses: Azure/login@v1.4.6
      with:
        client-id: ${{ secrets.azure-client-id }}
        tenant-id: ${{ secrets.azure-tenant-id }}
        subscription-id: ${{ secrets.azure-subscription-id }}

    - name: Setup .NET
      uses: actions/setup-dotnet@v3
      with:
        dotnet-version: 7.0.x

    - name: Check Other Chrome Version
      run: /usr/bin/google-chrome --version
    
    - name: Restore dependencies
      run: dotnet restore src

    - name: List Config Files
      run: ls src/Krukowskid.Tests.Common/Krukowskid.Tests.Common.Configuration

    - name: Add TestResults dir
      run: | 
        mkdir src/TestAutomation
        mkdir src/TestAutomation/TestResults
        mkdir src/TestAutomation/TestResults/AllureReports
      
    - name: 🦿 Override WebUrl
      if: ${{ inputs.tests-web-url != '' }}
      shell: bash --noprofile --norc {0}
      run: |
        echo "Setting E2E_TESTS__WEB__URL env var to ${{ inputs.tests-web-url }}"
        echo "E2E_TESTS__WEB__URL=${{ inputs.tests-web-url }}" >> $GITHUB_ENV
    
    - name: 🦿 Override ApiGatewayUrl
      if: ${{ inputs.tests-apigateway-url != '' }}
      shell: bash --noprofile --norc {0}
      run: |
        echo "Setting E2E_TESTS__APIGATEWAY__URL env var to ${{ inputs.tests-apigateway-url }}"
        echo "E2E_TESTS__APIGATEWAY__URL=${{ inputs.tests-apigateway-url }}" >> $GITHUB_ENV

    - name: 🏗 Build
      run: dotnet build src --no-restore

    - name: List Files
      run: |
        ls src -lR > src/TestAutomation/TestResults/post-build-files.txt
        ls ${{ github.workspace }}
        
    - name: 🦾 Install browser for Playwright tests
      shell: pwsh
      run: src/Krukowskid.Tests.UI/Krukowskid.Tests.UI.x/bin/Debug/net7.0/playwright.ps1 install --with-deps chromium
    
    - name: 🧪 Run tests for specified filter and rerun failed
      shell: bash --noprofile --norc {0}
      env:
        LC_ALL: en_US.utf8
      run: |
        counter=1
        exitcode=0
        reset="\e[0m"
        warn="\e[0;33m"
        green="\e[0;92m"
        blue="\e[0;94m"
        while [ $counter -lt 4 ]
        do
            if [ $filter ]
            then
                echo -e "${warn}Run number: $counter. Re-running failed tests filter: $filter ${reset}"
                # run test and forward output also to a file in addition to stdout (tee command)
                cp src/TestAutomation/TestResults/runtestsoutput.log src/TestAutomation/TestResults/runtestsoutput_first.log
                dotnet test src --no-build --filter=$filter --verbosity minimal --logger trx --results-directory src/TestAutomation/TestResults --settings:src/Krukowskid.Tests.Common/Krukowskid.Tests.Common.Configuration/cicd.runsettings | tee src/TestAutomation/TestResults/runtestsoutput.log
            else
                echo -e "${blue}First run. Running tests with filter "${{ inputs.tests-filter }}" ${reset}"
                dotnet test src --no-build --filter "${{ inputs.tests-filter }}" --verbosity minimal --logger trx --results-directory src/TestAutomation/TestResults --settings:src/Krukowskid.Tests.Common/Krukowskid.Tests.Common.Configuration/cicd.runsettings | tee src/TestAutomation/TestResults/runtestsoutput.log
            fi
            # capture dotnet test exit status, different from tee
            exitcode=${PIPESTATUS[0]}
            if [ $exitcode == 0 ]
            then
                echo -e "${green}Running tests succeeded after $counter attempts.${reset}"
                exit 0
            fi
            filter=$(cat src/TestAutomation/TestResults/runtestsoutput.log | grep -o -P '(?<=\sFailed\s)\w*'| grep -v -x 'Krukowskid' | awk 'BEGIN { ORS="|" } { print("Name=" $0) }' | grep -o -P '.*(?=\|$)')
            ((counter++))
        done
        exit $exitcode

    - name: List Files
      if: always()
      run: ls src -lR > src/TestAutomation/TestResults/post-tests-files.txt
    
    - name: 📈 Generate Github Report
      uses: dorny/test-reporter@v1
      if: always()
      with:
        name: ${{ inputs.report-name }} Test Execution Report
        path: 'src/TestAutomation/TestResults/*.trx'
        reporter: 'dotnet-trx'
        list-suites: 'all'
        fail-on-error: 'false'

    - name: Find Allure Reports
      if:  ${{ always() && inputs.allure-reports == true }} 
      shell: bash
      run: |        
        find src -type d -name "allure-results"        

    - name: Copy Allure Reports
      if:  ${{ always() && inputs.allure-reports == true }} 
      shell: bash
      run: |        
        find src -type d -name "allure-results" -exec cp -r -v {}/. src/TestAutomation/TestResults/AllureReports \;
              
    - name: 📈 Upload Allure Reports
      uses: unickq/send-to-allure-docker-service-action@v1
      if:  ${{ always() && github.ref_name == 'main' && inputs.allure-reports == true }} 
      continue-on-error: true
      with:
        allure_results: src/TestAutomation/TestResults/AllureReports
        project_id: ${{ inputs.allure-project-id }}
        auth: true
        generate: true       

    - name: Upload additional reports
      uses: actions/upload-artifact@v3
      if: always()
      with:
        name: ${{ inputs.report-name }}TestReports
        path: |
          src/TestAutomation
          src/**/TestResults
          src/**/bin/**/allureConfig.json
          src/**/bin/**/appSettings.*.json

Hi @krukowskid , From the description of this issue, I see the error is thrown from Azure CLI. But in the steps of "reusable workflow definition", I can't tell which step throws the exception. Could you answer these questions for the further analysis?

  • This error is thrown from one of the Azure CLI cmd, right?
  • Could you provide the screenshot of the workflow run? (an example of the screenshot)
  • Is ubuntu-latest-8core32ram a self-hosted runner?
  • Do you know which version of Azure CLI you're using for ubuntu-latest-8core32ram?
  • Have you tried the latest Azure CLI version?
  • This error is thrown from one of the Azure CLI cmd, right?

It's thrown in the dotnet tests (the "🧪 Run tests for specified filter and rerun failed" step), which use DefaultAzureCredential().

  • Could you provide the screenshot of the workflow run?

[image: screenshot of the workflow run]

  • Is ubuntu-latest-8core32ram a self-hosted runner?

It's a GitHub-hosted (large) runner. Same problem on ubuntu-latest.

  • Do you know which version of Azure CLI you're using for ubuntu-latest-8core32ram?

Same as on ubuntu-latest.

  • Have you tried the latest Azure CLI version?

On the day I created the issue, 1.4.6 was the latest. I will try with 1.5.0.

@krukowskid,
Azure Login Action works for Azure CLI and Azure PowerShell. But in your workflow file, "Run tests for specified filter and rerun failed" only calls dotnet commands. Do you mean the error is thrown from your C# source code? Have you checked whether the code runs the auth independently, without Azure CLI?

I am using DefaultAzureCredential. Locally (with visualstudioidentity) it works, it also works with azure login action with secret

@krukowskid , What I can see from Run tests for specified filter and rerun failed is the workflow file tries to run "dotnet test". I don't know what's inside.
Azure Login Action supports Azure CLI and Azure PowerShell. If it's pure C# test code, I don't think it'll work. If the tests call Azure CLI or Azure PowerShell, it's another story. Can you share more details with us?

In the dotnet code I am using DefaultAzureCredential from the Azure.Identity package. During authentication it loops through all possible authentication methods. When the tests run on the runner, it uses AzureCliCredential with the CLI context set on the runner by the azure/login action.
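The fallback behaviour described here can be pictured with a toy Python model. This is only an illustration of "try each source in order and report every failure", which is the shape of the error message at the top of this issue; it is not the Azure.Identity implementation, and every name in it is hypothetical.

```python
class CredentialUnavailable(Exception):
    """Raised by a source that cannot produce a token (toy stand-in)."""


def first_available_token(sources):
    """Try each (name, callable) pair in order.

    Returns (name, token) for the first source that succeeds, or raises one
    error that lists why every source failed.
    """
    failures = []
    for name, get_token in sources:
        try:
            return name, get_token()
        except CredentialUnavailable as exc:
            failures.append(f"- {name}: {exc}")
    raise CredentialUnavailable(
        "no credential produced a token\n" + "\n".join(failures)
    )


def environment():
    # On the runner in this thread no client secret is configured.
    raise CredentialUnavailable("environment variables are not fully configured")


def azure_cli():
    # Stands in for asking the logged-in Azure CLI for a token.
    return "token-from-az-cli"


used, token = first_available_token(
    [("EnvironmentCredential", environment), ("AzureCliCredential", azure_cli)]
)
print(used)  # AzureCliCredential
```

In this model, a failure inside the CLI-backed source (such as the expired assertion) surfaces as one line of the combined error rather than as a separate exception.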

Adding my "me too" on this problem: exactly the same error message and reporting of a 5-minute token. Out of curiosity, is there a point where the v1 tag should be moved back to a previously working commit in order to avoid lots of issues? I know that best practice is for workflows to use commit hashes instead of tags when referencing actions, but I'm sure there are lots of workflows that don't.

Hi @shaneholder could you please provide more details about your issue? As far as we know, v1.5.1 does not introduce issues like this. We're trying to reproduce this issue and figure out how it happens.
FYI, we would move v1 back to an older version if the latest version truly introduced issues, e.g. #371. However, on whether to move v1 to the latest version, everyone has a different opinion, e.g. #380.
Let's focus on this issue itself. Please help us by providing more details to reproduce it. If it's indeed an issue, we'll take the right action on it.

I don't know why, but I can't replicate it anymore. However, if you are still curious, this is what my workflow looks like:

name: Test Workflow for Debugging Azure Cli Credentials Timeout

on:
  workflow_dispatch:

permissions:
  id-token: write
  contents: read

jobs:
  azure:
    name: "Testing Azure Cli Timeout"
    runs-on: [self-hosted, linux, x64] # ubuntu-latest
    environment: Production
    steps:
      - name: Install Azure cli
        run: |
          sudo apt-get install ca-certificates curl apt-transport-https lsb-release gnupg -y
          curl -sL https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/microsoft.gpg > /dev/null
          AZ_REPO=$(lsb_release -cs)
          echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $AZ_REPO main" | sudo tee /etc/apt/sources.list.d/azure-cli.list
          sudo apt-get update
          sudo apt-get install azure-cli

      - name: Az CLI login
        uses: azure/login@v1
        with:
          client-id: ${{ vars.AZURE_CLIENT_ID }}
          tenant-id: ${{ vars.AZURE_TENANT_ID }}
          allow-no-subscriptions: true

      - name: Sleep for 10 minutes
        run: sleep 600

      - name: Az CLI Account Show
        run: az account show

What I suspect is that Azure CLI might have been updated on the Ubuntu runner we are using. (I'm not sure which version of Ubuntu we are running, but it might be that the latest Azure CLI was not yet the right version for our distribution.)

Scratch that, I actually still face it, but my real pipeline is a bit different, as it also installs azure-cli-core using pip3 for some requirements of the Azure Ansible collection.

I wonder if it's azure-cli-core (2.34.0) that messes with the token expiration, even though I log in with the action before even installing azure-cli-core. I am lost.

EDIT: it's not. I tested by forcing the installation of 2.55.0 with pip3 and got the same result. I'm trying some more workflows to see if I can replicate this in an isolated environment.

@benjamin-rousseau-shift I think the issue is with the underlying OIDC token issued by GitHub (5-minute expiry). It doesn't seem to be the fault of Azure CLI. I started having issues similar to yours after migrating to federated identity. I solved them:

https://stackoverflow.com/questions/77686072/issues-with-azure-identity-when-using-federated-credentials

I'm using python, but you can implement this fix in any other language:

def get_azure_credentials():
    token_request = os.environ.get("ACTIONS_ID_TOKEN_REQUEST_TOKEN")
    token_uri = os.environ.get("ACTIONS_ID_TOKEN_REQUEST_URL")
    subprocess_helper(f'token=$(curl -H "Authorization: bearer {token_request}" "{token_uri}&audience=api://AzureADTokenExchange" | jq .value -r) && az login --service-principal -u {CLIENT_ID} -t {TENANT_ID} --federated-token $token')
    return AzureCliCredential()
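A variant of the same idea that avoids building a shell string is sketched below. It assumes the job has `id-token: write` permission (so the two `ACTIONS_ID_TOKEN_REQUEST_*` variables are set) and that `az` is on the PATH; the function names are hypothetical, and only the `az login` flags already quoted in this thread are used. On Windows runners `az` is a `.cmd` wrapper, so the `subprocess.run` call may need adjusting there.

```python
import json
import os
import subprocess
import urllib.request

AUDIENCE = "api://AzureADTokenExchange"


def build_token_request(request_url: str, request_token: str) -> urllib.request.Request:
    """Build the HTTP request that asks GitHub for a fresh OIDC token."""
    return urllib.request.Request(
        f"{request_url}&audience={AUDIENCE}",
        headers={"Authorization": f"bearer {request_token}"},
    )


def build_login_command(client_id: str, tenant_id: str, federated_token: str) -> list[str]:
    """Build the az login argument list; no shell is involved, so nothing is interpolated."""
    return [
        "az", "login", "--service-principal",
        "-u", client_id, "-t", tenant_id,
        "--federated-token", federated_token,
        "--output", "none",
    ]


def relogin(client_id: str, tenant_id: str) -> None:
    """Fetch a new GitHub OIDC token and log the Azure CLI in again."""
    request = build_token_request(
        os.environ["ACTIONS_ID_TOKEN_REQUEST_URL"],
        os.environ["ACTIONS_ID_TOKEN_REQUEST_TOKEN"],
    )
    with urllib.request.urlopen(request) as response:
        federated_token = json.load(response)["value"]
    subprocess.run(build_login_command(client_id, tenant_id, federated_token), check=True)
```

Calling `relogin(client_id, tenant_id)` right before the first Azure call, and again whenever more than a few minutes have passed, mirrors the workaround described in this comment.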

@4c74356b41 By doing this I think you're basically doing exactly the same thing as the GitHub action.
My workaround for now is to log in to Azure again (just like you do in your Python script) right before I need to fetch something from Azure.
Not the fanciest solution, but yes, the OIDC tokens are only valid for 5 minutes; that's a fact, no matter what the documentation says :/

@4c74356b41 @benjamin-rousseau-shift, you are right. The GitHub OIDC provider issues a JWT ID token with a 5-minute expiration time. Its lifespan is not officially documented. By decoding the OIDC token, we can see that it actually expires after 5 minutes. You can also verify this in the sample token.
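Decoding the token to check its lifetime can be done with the standard library alone. The sketch below reads the JWT payload without verifying the signature (suitable for inspection only) and assumes the token carries `iat` and `exp` claims; the helper names and the sample token are made up for illustration.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Return the payload of a JWT without verifying its signature (inspection only)."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore the base64 padding that JWTs strip
    return json.loads(base64.urlsafe_b64decode(payload))


def lifetime_seconds(token: str) -> int:
    """Seconds between the iat (issued-at) and exp (expiry) claims."""
    claims = jwt_claims(token)
    return claims["exp"] - claims["iat"]


def _b64(obj: dict) -> str:
    """Encode a dict the way JWT segments are encoded (unpadded URL-safe base64)."""
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()


# A made-up, unsigned token with a 300-second window, for demonstration only.
sample = ".".join(
    [_b64({"alg": "none"}), _b64({"iat": 1698752389, "exp": 1698752689}), ""]
)
print(lifetime_seconds(sample))  # 300
```

Running `lifetime_seconds` on a real token fetched inside a workflow should show whether the five-minute window reported here applies to your runner.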

During login, Azure CLI uses the GitHub OIDC token to fetch an access token from MSAL. This access token is stored in msal_token_cache. Its lifetime is assigned a random value between 60 and 90 minutes (75 minutes on average). See https://learn.microsoft.com/en-us/entra/identity-platform/access-tokens#access-token-lifetime.

AzureCliCredential() authenticates by requesting a token from the Azure CLI. The instantiation of AzureCliCredential() alone will not raise the error. The error should occur when calling its method get_token(). It executes az account get-access-token --output json --resource {} to request a token from Azure CLI. See https://github.com/Azure/azure-sdk-for-python/blob/6aa171f81c0111996a2785b14864e961a7942e87/sdk/identity/azure-identity/azure/identity/_credentials/azure_cli.py#L24.

For az account get-access-token, Azure CLI first calls acquire_token_silent to attempt to get an access token from token cache. If no access token is returned, it calls acquire_token_for_client to get a new access token with client assertion in OIDC scenario, see Azure/azure-cli#13276 (comment).
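The cache-first behaviour described above can be modelled in a few lines of Python. This is a toy model under stated assumptions, not Azure CLI or MSAL code: the class and exception names are hypothetical, time is passed in as plain seconds, and the 75-minute lifetime is the average quoted in this thread.

```python
from dataclasses import dataclass, field


class AssertionExpired(Exception):
    """Stands in for AADSTS700024 in this toy model."""


@dataclass
class ToyCli:
    assertion_expires_at: float                # expiry of the GitHub OIDC token used at login
    access_token_lifetime: float = 75 * 60     # 60-90 minutes per the thread, 75 on average
    cache: dict = field(default_factory=dict)  # scope -> (token, expires_at)

    def get_access_token(self, scope: str, now: float) -> str:
        cached = self.cache.get(scope)
        if cached and cached[1] > now:
            return cached[0]  # cache hit: the assertion is not needed
        if now >= self.assertion_expires_at:
            raise AssertionExpired(
                f"no cached token for {scope} and the assertion has expired"
            )
        token = f"access-token-for-{scope}"
        self.cache[scope] = (token, now + self.access_token_lifetime)
        return token


ARM = "https://management.core.windows.net/"
STORAGE = "https://storage.azure.com/.default"

cli = ToyCli(assertion_expires_at=300)      # assertion valid for 5 minutes
cli.get_access_token(ARM, now=0)            # fetched at login time, then cached
print(cli.get_access_token(ARM, now=600))   # still served from the cache after 10 minutes
try:
    cli.get_access_token(STORAGE, now=600)  # new scope, assertion already expired
except AssertionExpired as exc:
    print("failed:", exc)
```

The model shows why a workflow can keep working for the scope used at login while a request for a different scope fails once the five-minute window has passed.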

Regarding @krukowskid's issue, the error ERROR: AADSTS700024: Client assertion is not within its valid time range. is most likely because DefaultAzureCredential fails to find or accept the access token in token cache and attempts to fetch a new access token again. At this point, the GitHub OIDC token is expired and cannot be used to fetch an access token.

In my local testing, it works seamlessly under normal conditions, returning the access token from the cache without needing to fetch a new access token from MSAL. I am wondering if you use GetToken() to request a different scope from that of the access token stored in the token cache. You may want to double-check the TokenRequestContext argument for DefaultAzureCredential().GetToken().

Not sure if I'm interpreting what you say correctly. Basically, what you are saying is that the default token in the token cache should still be valid for 75 minutes on average, and if we somehow retrieve that, it should work (even though the OIDC token has expired)?

@4c74356b41, you're correct. Azure CLI stores the access token fetched from MSAL, which is valid for 75 minutes on average. If you are retrieving this token from the cache, it should work without needing the OIDC token. But if you are retrieving a new access token from remote MSAL, it needs the OIDC token.

mkay, can you, please, help me understand how to reliably request token from the cache and not a new token?
thanks!

@4c74356b41, I tried the following Python code. It returns the token from the cache if it is still valid.

from azure.identity import AzureCliCredential
azure_cli_credential = AzureCliCredential()
print("AzureCliCredential: ", azure_cli_credential.get_token("https://management.core.windows.net/"))

That's what I was using, and it definitely isn't working with OIDC.

Just faced a similar issue.

In my case, the workflow is a quite long-running scheduled job that cleans up unwanted images from Azure Container Registry.

Here is the workflow file. Nothing fancy inside; technically it has only two moving parts:

  1. azure login
  2. run powershell script
cleanup.yml
name: cleanup
on:
  workflow_dispatch:
env:
  ARM_CLIENT_ID: 000000000-0000-0000-0000-000000000000
  ARM_USE_OIDC: true

permissions:
  contents: read
  id-token: write

jobs:
  cleanup:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v1
        with:
          client-id: 000000000-0000-0000-0000-000000000000
          tenant-id: 000000000-0000-0000-0000-000000000000
          subscription-id: 000000000-0000-0000-0000-000000000000
      - run: pwsh cleanup.ps1

The script itself is something like this (all irrelevant details stripped out): it iterates over images and deletes them from the container registry.

$ErrorActionPreference = "Stop"

$registry = 'demo'
az acr login -n $registry

# Step 1: retrieve images
# pretend we received images here
$items = @('demo.azurecr.io/foo:latest', 'demo.azurecr.io/bar:1.2.0')

# Step 2: delete images
$counter = 0
foreach ($image in $items) {
  try {
    az acr repository delete -n $registry --image $image --yes --only-show-errors
    Write-Host "$image - deleted" -ForegroundColor Green
    $counter += 1
  }
  catch {
    Write-Host "$image - failed" -ForegroundColor Red
  }
  # ♻️ workaround - manually refresh token
  if ($env:ARM_CLIENT_ID -and $counter % 100 -eq 0) {
    az login --service-principal -u $env:ARM_CLIENT_ID -t (az account show --query tenantId -o tsv) --federated-token (Invoke-RestMethod -Uri "$($env:ACTIONS_ID_TOKEN_REQUEST_URL)&audience=api://AzureADTokenExchange" -Headers @{Authorization = "Bearer $($env:ACTIONS_ID_TOKEN_REQUEST_TOKEN)" } | Select-Object -ExpandProperty value)
  }
}

As you can guess, because it deletes images one by one, it takes some time, definitely more than 5 minutes; in my case the job took 2 hours.

So after a while, all attempts to delete images fail with the following error:

ERROR: AADSTS700024: Client assertion is not within its valid time range. Current time: 2024-03-14T16:07:58.2005292Z, assertion valid from 2024-03-14T15:12:23.0000000Z, expiry time of assertion 2024-03-14T15:17:23.0000000Z. Review the documentation at https://docs.microsoft.com/azure/active-directory/develop/active-directory-certificate-credentials . Trace ID: 849defde-0aa5-4a2f-a30d-ec73d2266000 Correlation ID: 9706d64c-2538-4e10-8808-cb3f37cb0a93 Timestamp: 2024-03-14 16:07:58Z
  Interactive authentication is needed. Please run:
  az login

So I was wondering if there is some kind of workaround, like an az refresh command or something like that 🤔

And many thanks to @4c74356b41 for pointing it out: there is. I added an example of how it may be done in PowerShell.

Use this workaround, detailed previously:

token=$(curl -s -H "Authorization: bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" "${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=api://AzureADTokenExchange" | jq .value -r)
az login --service-principal -u {CLIENT_ID} -t {TENANT_ID} --federated-token "$token"

you can create a timer to call this every 5 minutes or you can simply do this every iteration (or every other iteration, etc)
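The timer idea could look like the following Python sketch. It assumes you already have some callable that performs the re-login; the callable is passed in, so nothing here is specific to Azure. `start_periodic` is a hypothetical helper name, and the demo uses a short interval and a counter instead of a real login.

```python
import threading
import time


def start_periodic(action, interval_seconds: float) -> threading.Event:
    """Run action() on a daemon thread, once at start and then every interval_seconds.

    Returns an Event; call .set() on it to stop the loop.
    """
    stop = threading.Event()

    def loop():
        while True:
            action()
            # wait() returns True as soon as stop is set, which ends the loop.
            if stop.wait(interval_seconds):
                return

    threading.Thread(target=loop, daemon=True).start()
    return stop


# Demo with a stand-in action; a real workflow would pass its re-login function
# and an interval safely below the 5-minute token lifetime (for example 240 seconds).
calls = []
stop = start_periodic(lambda: calls.append("login"), interval_seconds=0.01)
time.sleep(0.1)
stop.set()
print(len(calls) >= 2)  # True
```

If the action raises, the background thread ends silently, so a production version would probably want to catch and log exceptions inside the loop.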

you can also use runspaces to finish everything 10x faster or smth

We've recently been experiencing this issue, it was working fine before, and no changes have been made to the workflow.

Setup:

  • Runner using the ubuntu-latest image.
  • azure/login using OIDC login.
  • run steps calling the Azure CLI directly.

We noticed that the issue arose when the GitHub hosted runner image went from 20240324.2.0 to 20240407.1.0. The PR shows that the Azure CLI was updated from 2.58.0 to 2.59.0, see https://github.com/actions/runner-images/pull/9656/files#diff-66aec6097318276b09842a3ba2caf3037afbd8dadca2dfcdf76631100613ea69R111.
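To check from a script whether a runner has the version suspected here, something like the sketch below could be used. It treats 2.59.0 as the affected release purely on the basis of reports in this thread, and it assumes `az version --output json` returns an object with an `azure-cli` key; the function names are hypothetical.

```python
import json
import subprocess

AFFECTED = (2, 59, 0)  # assumption taken from this thread, not from release notes


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.59.0' into (2, 59, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def is_affected(cli_version: str) -> bool:
    """True if the given Azure CLI version is the one reported as regressed."""
    return parse_version(cli_version) == AFFECTED


def installed_cli_version() -> str:
    """Ask the local Azure CLI for its version (requires az on the PATH)."""
    result = subprocess.run(
        ["az", "version", "--output", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)["azure-cli"]


print(is_affected("2.58.0"))  # False
print(is_affected("2.59.0"))  # True
```

A workflow step could call `is_affected(installed_cli_version())` and print a warning or trigger a downgrade when it returns True.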

I'm not aware of nice workarounds for now, so I'll add more azure/login steps...

Same here, now experiencing it way more often... gotta put in more login steps.
Azure is slow with deploying some resources and it's just a pain in the ... to have to relog for every action.

Workaround in pwsh

Write-Verbose -Verbose "Force refresh token" # https://github.com/Azure/login/issues/372
# The audience is appended once here, so the request below uses $uri as-is.
$uri = "$($ENV:ACTIONS_ID_TOKEN_REQUEST_URL)&audience=api://AzureADTokenExchange"
$reqToken = "bearer $($ENV:ACTIONS_ID_TOKEN_REQUEST_TOKEN)"

Write-Verbose -Verbose "Get token"
$token = Invoke-RestMethod -Method GET -Uri $uri -Headers @{ "Authorization" = $reqToken } | Select-Object -ExpandProperty value
Write-Verbose -Verbose "Login"
az login --service-principal -u REPLACE_W_CLIENTID -t REPLACE_W_TENANTID --federated-token $token

I am the developer of Azure CLI for federated identity credential support. Please see Azure/azure-cli#28708 (comment) for a temporary mitigation to extend the task duration to 60 minutes.

@jiasli, thanks for suggesting this workaround. I tried your suggestion in my pipeline, but still run into the same issue as before. Example run: https://github.com/microsoft/hi-ml/actions/runs/8642139946/job/23692828663, using the workflow updated like this: microsoft/hi-ml#925

Roughly speaking, in our test suite, we repeatedly run tests that

  • get a credential (AzureCliCredential or service principal)
  • run an Azure or AzureML operation using that credentials object

Despite having added various different scoped access tokens, I always eventually hit a token expiry problem

A nice solution with automatic periodic refresh has been suggested in Azure/azure-cli#28708 (comment), which you can wrap in a custom GitHub action like the one shown below. It can potentially be used as a temporary replacement for this action in long-running workflows.

name: Azure Federated Login

inputs:
  client-id:
    description: Azure client id
    type: string
  tenant-id:
    description: Azure tenant id
    type: string
  subscription-id:
    description: Azure subscription id
    type: string
    default: none
  refresh-interval-seconds:
    description: Refresh interval in seconds
    type: number
    default: 240


runs:
  using: "composite"
  steps:
    - name: Fetch OID token every ${{ inputs.refresh-interval-seconds }} seconds
      shell: bash
      run: |
        first_time=true
        while true; do
          token=$(curl -s -H "Authorization: bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" "${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=api://AzureADTokenExchange" | jq .value -r)
          az login --service-principal -u ${{ inputs.client-id }} -t ${{ inputs.tenant-id }} --federated-token $token --output none
          if [ "$first_time" = true ] && [ "${{ inputs.subscription-id }}" != "none" ]; then
            az account set -s ${{ inputs.subscription-id }}
            first_time=false
          fi
          sleep ${{ inputs.refresh-interval-seconds }}
        done &

The temporary solution does not work when using the Packer Azure provider in HCL templates. In our case we use Packer templates to create custom Azure VM images, with use_azure_cli_auth: true as the authentication mode.

source "azure-arm" "image" {
  location                               = "${var.location}"



  // Auth
  use_azure_cli_auth                     = true
  subscription_id                        = "${var.subscription_id}"

  // Rest omitted.

}

The process takes 6 hours to create fresh VM images, and at the end of the script, when Packer wants to create the final image in the Azure gallery, we receive the same error:


==> azure-arm.image: authorizing request: running Azure CLI: exit status 1: ERROR: AADSTS700024: Client assertion is not within its valid time range. Current time: 2024-04-12T06:48:06.1011631Z, assertion valid from 2024-04-12T01:04:10.0000000Z, expiry time of assertion 2024-04-12T01:14:10.0000000Z. Review the documentation at https://docs.microsoft.com/azure/active-directory/develop/active-directory-certificate-credentials . Trace ID: bcaf1c3c-98f2-4cb3-b0db-61aa68f15701 Correlation ID: b3cc18c0-bc01-403e-9d49-a119ac9bbc46 Timestamp: 2024-04-12 06:48:06Z
==> azure-arm.image: Interactive authentication is needed. Please run:
==> azure-arm.image: az login

Sorry to mention you, @jiasli: does your fix take such scenarios into account as well? That is, long-running pipelines (up to 6 hours), handled by refreshing the access token in the background, providing refresh_tokens and getting access_tokens in return?

❗ ❗ ❗If you are encountering ERROR: AADSTS700024: Client assertion is not within its valid time range, here are the workarounds for four scenarios:

  1. If your workflow recently started failing after 5 minutes and azure-cli on your runner was upgraded to 2.59.0:

    Workaround: Downgrade azure-cli to 2.58.0. The following scripts downgrade the azure-cli version on your agent.

    • If you are using azure/cli action, specify azcliversion with an older version of Azure CLI below 2.59.0, such as 2.58.0.
      - uses: azure/cli@v2
        with:
          azcliversion: 2.58.0
          inlineScript: |
            az --version
    • If you are using other actions depending on azure-cli, downgrade azure-cli on Linux runners:
        jobs:
          linux-regression:
            runs-on: ubuntu-latest
            steps:
               - name: uninstall azure-cli 
                 run: |
                    sudo apt-get remove -y azure-cli
               - name: install azure-cli 2.58.0
                 run: |
                    sudo apt-get update
                    sudo apt-get install apt-transport-https ca-certificates curl gnupg lsb-release
                    sudo mkdir -p /etc/apt/keyrings
                    curl -sLS https://packages.microsoft.com/keys/microsoft.asc |
                        sudo gpg --dearmor -o /etc/apt/keyrings/microsoft.gpg
                    sudo chmod go+r /etc/apt/keyrings/microsoft.gpg
                    AZ_DIST=$(lsb_release -cs)
                    echo "Types: deb
                    URIs: https://packages.microsoft.com/repos/azure-cli/
                    Suites: ${AZ_DIST}
                    Components: main
                    Architectures: $(dpkg --print-architecture)
                    Signed-by: /etc/apt/keyrings/microsoft.gpg" | sudo tee /etc/apt/sources.list.d/azure-cli.sources
                    AZ_VER=2.58.0
                    sudo apt-get update && sudo apt-get install azure-cli=${AZ_VER}-1~${AZ_DIST}
               - name: check azure-cli version
                 run: |
                    az --version
    • Downgrade azure-cli on Windows runners:
      jobs:
        windows-regression:
          runs-on: windows-latest
          steps:
             - name: uninstall azure-cli 
               run: |
                  Start-Process msiexec.exe -Wait -ArgumentList '/x {DEFB65A7-FD02-4710-B01E-6C9387982CA9} /quiet'
             - name: install azure-cli 2.58.0
               run: |
                  $ProgressPreference = 'SilentlyContinue'; Invoke-WebRequest -Uri https://azcliprod.blob.core.windows.net/msi/azure-cli-2.58.0-x64.msi -OutFile .\AzureCLI.msi; Start-Process msiexec.exe -Wait -ArgumentList '/I AzureCLI.msi /quiet'; Remove-Item .\AzureCLI.msi
             - name: check azure-cli version
               run: |
                  az --version

    Note that downgrading Azure CLI may take some time to finish, but this workaround is only necessary until Azure CLI 2.60.0 is released.

  2. If your workflow also fails after 5 minutes with azure-cli <= 2.58.0:

    • This happens when the token cache holds no access token for the requested scope. Azure CLI then tries to get one with the GitHub ID token, but that ID token expires after 5 minutes, so the request fails with ERROR: AADSTS700024. See Azure/azure-cli#28708 (comment).
    • This is expected to be solved once azure-cli supports ID token refresh.

    Workaround: Request access tokens for all the scopes you need within the first 5 minutes. The script below covers the most commonly requested scopes; modify it to match your needs.

      - uses: azure/cli@v2
        with:
          azcliversion: 2.58.0
          inlineScript: |
              # Storage:
              az account get-access-token --scope https://storage.azure.com/.default --output none 
              # Key Vault: 
              az account get-access-token --scope https://vault.azure.net/.default --output none
              # Microsoft Graph: 
              az account get-access-token --scope https://graph.microsoft.com/.default --output none
              # Kusto: 
              az account get-access-token --scope https://kusto.kusto.windows.net/.default --output none
  3. If your workflow fails after 60 minutes:
    This is because azure-cli can only request access tokens with a lifetime of 60 minutes. Since the ID token expired after 5 minutes, azure-cli cannot get a new access token once those 60 minutes are up. This is expected to be solved once azure-cli supports ID token refresh.

    Workaround: Use user-assigned managed identities with OIDC instead of service principals.
    Tokens for managed identities are valid for 24 hours (see Managed identities tokens cache), which covers the lifetime of most CI/CD workflows.

  4. If your workflow fails after 5 minutes with azure-powershell < 9.2:
    This is the scenario discussed in #180. It was fixed in Azure PowerShell v9.2 (released on 12/6/2022). See #180 (comment).

Check your scenario and use the provided workaround. We're actively working to resolve this issue. Thank you for your understanding.
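
For scenario 2, the pre-fetch step can also be wrapped in a small shell function so the scope list stays in one place. This is only a sketch: `warm_token_cache` is a made-up helper name, not part of azure-cli, and it assumes `az` is already logged in via OIDC and that the function runs within the first 5 minutes of the job. The only CLI call it uses is the `az account get-access-token --scope ... --output none` command from the workaround above.

```shell
# Sketch: request one access token per scope right after login, while the
# GitHub ID token (valid for about 5 minutes) can still be exchanged.
# warm_token_cache is a hypothetical helper name, not an azure-cli command.
warm_token_cache() {
  local scope failed=0
  for scope in "$@"; do
    # Same command as in the workaround; --output none keeps the token out of the log.
    if az account get-access-token --scope "$scope" --output none; then
      echo "cached: $scope"
    else
      echo "failed: $scope" >&2
      failed=1
    fi
  done
  # Non-zero exit status if any scope could not be cached.
  return "$failed"
}
```

Calling it as `warm_token_cache https://storage.azure.com/.default https://vault.azure.net/.default` in the first step after azure/login should fail the step early if any scope cannot be cached, rather than much later in the workflow.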

hey, we are using 2.60.0 and still seeing:

ERROR: AADSTS700024: Client assertion is not within its valid time range.
Current time: 2024-05-14T03:56:38.3260093Z, assertion valid from
2024-05-14T03:32:44.0000000Z, expiry time of assertion
2024-05-14T03:37:44.0000000Z.

az version output:

{
  "azure-cli": "2.60.0",
  "azure-cli-core": "2.60.0",
  "azure-cli-telemetry": "1.1.0",
  "extensions": {
    "resource-graph": "2.1.0"
  }
}

any pointers?

Hi @4c74356b41, please review scenario 2. The Azure CLI currently does not support ID token refresh.
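
In case it helps with triage, the timestamps in the error message itself show which scenario applies. The sketch below pulls the three timestamps out of an AADSTS700024 message and prints how long the assertion was valid and how long after expiry the token request was made. It assumes GNU `date` (as on ubuntu-latest runners) and that the message lists current time, valid-from, and expiry in that order, as in the errors quoted in this thread; `explain_aadsts700024` is a hypothetical name.

```shell
# Sketch: summarize the time window reported in an AADSTS700024 error.
# Assumes GNU date (-d) and the timestamp order: current, valid from, expiry.
explain_aadsts700024() {
  local now from exp
  # Extract ISO 8601 timestamps up to the seconds, dropping the fractional part.
  set -- $(printf '%s\n' "$1" |
    grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}')
  [ "$#" -ge 3 ] || { echo "no timestamps found" >&2; return 1; }
  # Convert each timestamp (UTC) to seconds since the epoch.
  now=$(date -u -d "${1}Z" +%s)
  from=$(date -u -d "${2}Z" +%s)
  exp=$(date -u -d "${3}Z" +%s)
  echo "assertion lifetime: $((exp - from))s; request made $((now - exp))s after expiry"
}
```

Fed the error from the 2.60.0 report above, this prints `assertion lifetime: 300s; request made 1134s after expiry`, i.e. the token was requested roughly 19 minutes after the 5-minute ID token had expired, which matches scenario 2.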