actions / runner-images

GitHub Actions runner images

[Windows] mongo service will be disabled by default on August, 8th

ddobranic opened this issue · comments

Breaking changes

The mongodb service on Windows Server, which currently runs by default, will be disabled

Target date

Propagation starts on August 8th and will take 2-3 days

The motivation for the changes

Currently, mongodb service on Windows Server is running but it should be disabled by default

Possible impact

The mongodb service and all related processes will be disabled

Platforms affected

  • Azure DevOps
  • GitHub Actions

Virtual environments affected

  • Ubuntu 18.04
  • Ubuntu 20.04
  • Ubuntu 22.04
  • macOS 10.15
  • macOS 11
  • macOS 12
  • Windows Server 2019
  • Windows Server 2022

Mitigation ways

Set-Service mongodb -StartupType Automatic          
Start-Service -Name mongodb

Hi @ddobranic,

I'm currently experiencing an error that might be related.

image

Context: I'm building this image in an Azure DevOps pipeline to store it in an Azure image gallery, based on https://github.com/YannickRe/azuredevops-buildagents

@dsfrederic, pull the latest changes from the main branch. It has been fixed in #6030

It seems to have gone past the MongoDB install. Now we're one step further and the scripts fail again. Can you help me along? Is there a way to run only verified scripts? In other words, what's the branching/tagging strategy?

image

We make a release once a week, when a new image version is released. For your specific error, the CodeQL archive seems to be inconsistent; please try again.

Hi - I've updated my CI today with Start-Service -Name mongodb in order to handle this change, however we've already seen a few runs fail with:

Run Start-Service -Name mongodb
Start-Service: D:\a\_temp\5bb2a140-67d9-46e9-a0e4-447e48467acf.ps1:2
Line |
   2 |  Start-Service -Name mongodb
     |  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
     | Service 'MongoDB Server (MongoDB) (mongodb)' cannot be started due to the following error: Cannot
     | start service 'mongodb' on computer '.'.

Error: Process completed with exit code 1.

(https://github.com/SMI/SmiServices/runs/7769010886?check_suite_focus=true#step:8:1)

Any ideas? Thanks.

@rkm, it should be:

- run: |
        Set-Service mongodb -StartupType Automatic          
        Start-Service -Name mongodb

@al-cheb Have tried this now, but still encountering the issue

Please provide a link to the build.

Here's our most recent instance https://github.com/SMI/SmiServices/runs/7771359161?check_suite_focus=true

@jas88, just out of curiosity, what happens when you move this step right after the checkout step?

@al-cheb Tried it; both runs completed OK this time, but that doesn't say much since most but not all of the last few runs worked...

Both images have been deployed with a disabled mongo service by default.

As documented in https://github.com/orgs/community/discussions/30083, I encounter the same issue as @rkm described in #5949 (comment):

Run Set-Service mongodb -StartupType Automatic          
  Set-Service mongodb -StartupType Automatic          
  Start-Service -Name mongodb
  shell: C:\Program Files\PowerShell\7\pwsh.EXE -command ". '{0}'"
Start-Service: D:\a\_temp\d215d86b-60cc-42b6-9911-58c92d5c02a6.ps1:3
   3 |  Start-Service -Name mongodb
     |  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
     | Service 'MongoDB Server (MongoDB) (mongodb)' cannot be started due to the following error: Cannot
     | start service 'mongodb' on computer '.'.
Error: Process completed with exit code 1.

This happens using the instructions provided by @al-cheb, with this step:

      - name: Start MongoDB (Windows)
        run: |
          Set-Service mongodb -StartupType Automatic          
          Start-Service -Name mongodb
        if: ${{ env.RUNNER_OS }} == 'Windows'

This yields the exact same result as what I had before finding this issue, when I used only Start-Service -Name "MongoDB".
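A side note on the condition in the step above: as far as I can tell, an `if:` value that mixes `${{ }}` with trailing literal text is evaluated as a non-empty string and is therefore always truthy, and `RUNNER_OS` is a runner environment variable rather than a key of the `env` context. That would not explain the start failure on Windows, but it would make the step run on every OS. A minimal sketch of the same step using the `runner.os` context instead (`shell: pwsh` is an assumption taken from the log output above):

```yaml
      - name: Start MongoDB (Windows)
        # runner.os is one of 'Linux', 'Windows' or 'macOS';
        # a bare expression needs no ${{ }} wrapper in `if:`
        if: runner.os == 'Windows'
        shell: pwsh
        run: |
          Set-Service MongoDB -StartupType Automatic
          Start-Service -Name MongoDB
```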

What can we do to bring Mongo back to Windows? Currently, the only option left seems to be to disable MongoDB-related tests in CI, which plays strongly against GitHub Actions.

Note on process

I am a bit confused about how I can get notified of such major, breaking changes. Our builds started failing on Windows with no clear reason, and it took us two half days to identify the root cause (MongoDB not being started anymore on Windows) since it coincided with a major MongoDB release and we did not suspect that GitHub Actions would change its infrastructure this way. There is no documentation on how to start the service in the image description nor in the Actions doc. This issue itself was referred to me by a colleague, and I did not manage to find it through different searches. In the end, the proposed workaround does not seem to work.

More generally, this issue was opened on 22/07 for deployment on 08/08, and the rationale for such a breaking change with no clear workaround at the time of writing is “mongodb service on Windows Server is running but it should be disabled by default”. This does not make it very clear why we have to pay such a strong adaptation cost on short notice.

What is the way we, as users, are expected to follow with such breaking changes? The volume of issues makes it impractical to watch the repository. Should we check this repository weekly for “Announcement”-tagged issues? 🙂

@MattiSG, could you add the start mongodb service step right after checkout or the first one?

Thank you @al-cheb for your quick reply! I thought I got it to work by adjusting your instructions to use the case-sensitive version of the service name (with MongoDB and not mongodb as the service name). However, I am now in the same situation as described by @jas88, where one run passed and the next run failed 😕

could you add the start mongodb service step right after checkout or the first one?

Done in this run. It took two minutes of waiting before the step finally output “WARNING: Waiting for service 'MongoDB Server (MongoDB) (MongoDB)' to start...”, but it passed. However, I don't know how reliable a positive result can be considering my and @jas88’s experience with random results 😕 I will run a few more iterations and report.

@al-cheb Could you please clarify the rationale behind starting the service as early as possible in the process? I assume this is because it might take some time for the server to start, and our tests might fail earlier than that. However, in our case, dependencies are installed and a first series of tests are executed in-between the service start and the usage of MongoDB, averaging over a minute, which should be enough for the MongoDB server to be up.
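If startup time is the concern, one option is to wait explicitly for the service to reach the Running state instead of relying on the time taken by later steps. The following is only a sketch under assumptions, not a confirmed fix: the step name and the 5-minute timeout are hypothetical, and it presumes the service is registered as `MongoDB` as in the rest of this thread.

```yaml
      - name: Start MongoDB and wait until it is running (Windows)
        if: runner.os == 'Windows'
        shell: pwsh
        run: |
          Set-Service -Name MongoDB -StartupType Automatic
          # Send the start request; sc.exe may return while the service is still START_PENDING
          sc.exe start MongoDB
          if ($LASTEXITCODE -ne 0) { throw "sc.exe start failed with exit code $LASTEXITCODE" }
          # Block until the service reports Running; WaitForStatus throws
          # System.ServiceProcess.TimeoutException if the (hypothetical) 5 minutes elapse
          $svc = Get-Service -Name MongoDB
          $svc.WaitForStatus([System.ServiceProcess.ServiceControllerStatus]::Running, [TimeSpan]::FromMinutes(5))
          Write-Host "MongoDB service status: $((Get-Service -Name MongoDB).Status)"
```

With this in place, a slow start shows up as a failure of this step rather than as unrelated-looking test failures later in the job.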

I see that @ankane used:

sc config MongoDB start= auto
sc start MongoDB

Since I am not familiar with Windows services, is there any reason why this could work if Start-Service fails? 🙂

I confirm that on a second run, even as “first step after checkout”, the service start fails.

@MattiSG, Could you check?

- run: |
    sc.exe config MongoDB start= auto
    sc.exe start MongoDB

With sc.exe instead of Start-Service, I had 3 successful attempts out of 4. So far, this is the most reliable way to start MongoDB on Windows. I will rebase the branch and add a few more refactors, which should trigger additional builds. I will report here if additional failures are encountered.

On a first run, the service did start and logged:

SERVICE_NAME: MongoDB 
        TYPE               : 10  WIN32_OWN_PROCESS  
        STATE              : 2  START_PENDING 
                                (NOT_STOPPABLE, NOT_PAUSABLE, IGNORES_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x7d0
        PID                : 3884
        FLAGS              : 

However, all sorts of weird, never-seen-before errors appeared in the run. Those did not seem to be directly related to MongoDB.

On a second run, the service started and logged the exact same lines (except for the PID of course) and all tests passed.

On a third run, same as second.

On a fourth run, same as second.

The sc.exe workaround seems the most reliable until now. However, since its introduction, we regularly get failures on non-MongoDB-related parts of the test suite. These are random and did not happen before.

The duration for starting up the MongoDB service is between 1 and 3 minutes, leading to a significant slowdown of our pipeline:

Screen Shot 2022-08-23 at 15 59 08

Following this change, we are considering disabling MongoDB tests on Windows. This is disappointing and plays against GitHub Actions as a cross-platform CI engine.

Stability report after running each syntax 10 times

The following syntaxes were run in GitHub Actions on windows-2022 10 times in a row. They all show about 90% success, tending to demonstrate there is no difference in failure rate depending on syntax.

The test suite itself is not inherently flaky, as each run was also run in ubuntu-latest and was successful every time.

In almost two years of running this test suite cross-platform, Windows tests had only one moment of diverging failure a few months ago. This observed increase to 10% failure rate seems to be a consequence of the change introduced here.

I will report in a few days or weeks if this data changes.

sc.exe : 9 successes out of 10 runs

sc.exe config MongoDB start= auto
sc.exe start MongoDB

9/10 success

Start-Service: 9 successes out of 10 runs

Set-Service MongoDB -StartupType Automatic          
Start-Service -Name MongoDB

9/10 success

sc over JavaScript: 8 successes out of 10 runs

run(`sc config MongoDB start= auto`);
run(`sc start MongoDB`);

Screen Shot 2022-08-24 at 12 04 40

This issue seems to have resurfaced for us unfortunately. Example run here.

Our workflow file is unchanged, and contains:

      - name: "[windows] start MongoDB service"
        if: ${{ matrix.os == 'windows' }}
        shell: pwsh
        run: |
          Set-Service mongodb -StartupType Automatic
          Start-Service -Name mongodb

This has also recently restarted for a project I'm working on.

@al-cheb is there any more information we could provide to help diagnose this issue?

@rkm, could you please check?

- run: |
    sc.exe config MongoDB start= auto
    sc.exe start MongoDB

@ddobranic We've changed to the sc.exe method, but it now occasionally fails with:

[SC] ChangeServiceConfig SUCCESS
[SC] StartService FAILED 1053:

The service did not respond to the start or control request in a timely fashion.

Example run: https://github.com/SMI/SmiServices/actions/runs/4563723915/jobs/8052574736?pr=1489#step:4:17
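For intermittent `StartService FAILED 1053` results like the one above, a possible stopgap is to retry the start request a few times before failing the job. This is a hedged sketch only: the attempt count and the sleep interval are hypothetical values, nothing in this thread confirms that retrying resolves the underlying problem, and it assumes `sc.exe` reports the Win32 error code as its exit code.

```yaml
      - name: Start MongoDB with retries (Windows)
        if: runner.os == 'Windows'
        shell: pwsh
        run: |
          sc.exe config MongoDB start= auto
          $maxAttempts = 3   # hypothetical value
          $started = $false
          for ($attempt = 1; $attempt -le $maxAttempts; $attempt++) {
            sc.exe start MongoDB
            # 0 = start request accepted; 1056 = ERROR_SERVICE_ALREADY_RUNNING
            if ($LASTEXITCODE -eq 0 -or $LASTEXITCODE -eq 1056) { $started = $true; break }
            Write-Host "Attempt $attempt failed with exit code $LASTEXITCODE, retrying"
            Start-Sleep -Seconds 10   # hypothetical pause between attempts
          }
          if (-not $started) { throw "MongoDB service did not start after $maxAttempts attempts" }
          # Exit explicitly so a 1056 from the last sc.exe call does not fail the step
          exit 0
```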