prometheus-community / windows_exporter

Prometheus exporter for Windows machines

Failed to install 0.25.0 on windows container .Net framework runtime

dansimov04012022 opened this issue · comments

Using windows container image mcr.microsoft.com/dotnet/framework/runtime:4.8 in docker on Windows Server 2019 Datacenter amd64 AWS EC2 instance.

Installing the latest 0.25.0 release with the following commands:

invoke-webrequest -uri https://github.com/prometheus-community/windows_exporter/releases/download/v0.25.0/windows_exporter-0.25.0-amd64.msi -usebasicparsing -outfile .\windows_exporter-0.25.0-amd64.msi
$arg = "/qn /i windows_exporter-0.25.0-amd64.msi /L*V install.log"
Start-Process msiexec.exe -Wait -ArgumentList "$arg"

The EventLog shows this:

The description for Event ID '3299' in Source 'windows_exporter' cannot be found.  The local computer may not have the necessary registry information or message DLL files to display the message, or you may not have permission to access them.  The following information is part of the event:'ts=2024-01-13T16:29:24.574Z caller=exporter.go:226 level=info msg="Build context" build_context="(go=go1.21.5, platform=windows/amd64, user=runneradmin@fv-az1388-9, date=20240111-08:30:58, tags=unknown)"', '', '', '', '', '', '', '', ''
The description for Event ID '3299' in Source 'windows_exporter' cannot be found.  The local computer may not have the necessary registry information or message DLL files to display the message, or you may not have permission to access them.  The following information is part of the event:'ts=2024-01-13T16:29:24.574Z caller=exporter.go:225 level=info msg="Starting windows_exporter" version="(version=0.25.0, branch=heads/tags/v0.25.0, revision=6ede10e29aeea4e5aa250a20ade395ce14058fdc)"', '', '', '', '', '', '', '', ''
The description for Event ID '3299' in Source 'windows_exporter' cannot be found.  The local computer may not have the necessary registry information or message DLL files to display the message, or you may not have permission to access them.  The following information is part of the event:'ts=2024-01-13T16:29:24.587Z caller=tls_config.go:316 level=info msg="TLS is disabled." http2=false address=[::]:9182', '', '', '', '', '', '', '', ''
The description for Event ID '3299' in Source 'windows_exporter' cannot be found.  The local computer may not have the necessary registry information or message DLL files to display the message, or you may not have permission to access them.  The following information is part of the event:'ts=2024-01-13T16:29:24.586Z caller=tls_config.go:313 level=info msg="Listening on" address=[::]:9182', '', '', '', '', '', '', '', ''
The description for Event ID '3299' in Source 'windows_exporter' cannot be found.  The local computer may not have the necessary registry information or message DLL files to display the message, or you may not have permission to access them.  The following information is part of the event:'ts=2024-01-13T16:29:24.421Z caller=textfile.go:103 level=info collector=textfile msg="textfile collector directories: C:\\Program Files\\windows_exporter\\textfile_inputs"', '', '', '', '', '', '', '', ''
The description for Event ID '3299' in Source 'windows_exporter' cannot be found.  The local computer may not have the necessary registry information or message DLL files to display the message, or you may not have permission to access them.  The following information is part of the event:'ts=2024-01-13T16:29:24.421Z caller=service.go:93 level=warn collector=service msg="No where-clause specified for service collector. This will generate a very large number of metrics!"', '', '', '', '', '', '', '', ''
The description for Event ID '3299' in Source 'windows_exporter' cannot be found.  The local computer may not have the necessary registry information or message DLL files to display the message, or you may not have permission to access them.  The following information is part of the event:'ts=2024-01-13T16:29:24.574Z caller=exporter.go:172 level=info msg="Enabled collectors: cs, logical_disk, physical_disk, system, textfile, cpu, net, os, service"', '', '', '', '', '', '', '', ''
The description for Event ID '3299' in Source 'windows_exporter' cannot be found.  The local computer may not have the necessary registry information or message DLL files to display the message, or you may not have permission to access them.  The following information is part of the event:'ts=2024-01-13T16:29:24.563Z caller=exporter.go:165 level=info msg="Running as NT AUTHORITY\\SYSTEM"', '', '', '', '', '', '', '', ''
Ending a Windows Installer transaction: C:\tmp\windows_exporter-0.25.0-amd64.msi. Client Process Id: 1792.
Ending session 4 started 2024-01-13T16:25:52.493635300Z.
Windows Installer installed the product. Product Name: windows_exporter. Product Version: 0.25.0. Product Language: 1033. Manufacturer: prometheus-community. Installation success or error status: 1603.
Product: windows_exporter -- Error 1920. Service 'windows_exporter' (windows_exporter) failed to start.  Verify that you have sufficient privileges to start system services.
Product: windows_exporter -- Installation failed.

The whoami command reports user manager\containeradministrator.

Installation debug log is attached.

install.log

How to reproduce it locally:

Dockerfile

FROM mcr.microsoft.com/dotnet/framework/runtime:4.8
SHELL ["powershell", "-NoProfile", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]

ARG VERSION=0.25.0
ARG URL=https://github.com/prometheus-community/windows_exporter/releases/download/v${VERSION}/windows_exporter-${VERSION}-amd64.msi

WORKDIR C:\\temp
RUN Invoke-Webrequest -uri $env:URL -usebasicparsing -outfile "C:\temp\windows_exporter-$env:VERSION-amd64.msi"
RUN Start-Process 'msiexec' -NoNewWindow -Wait \
        -ArgumentList "/i", "c:\temp\windows_exporter-$env:VERSION-amd64.msi", "/qn", "/norestart", "/L*v", "c:\temp\install.log" 

Build & run:

docker build -t windows_exporter .
docker run -it --rm windows_exporter powershell
get-content c:\temp\install.log

Tested installing Zabbix Agent on the same docker image with MSI. It failed to create the Firewall rule with 1603 error, but Zabbix msi has the option to skip Firewall rule creation, so with that option set it installed successfully.

Dockerfile:

FROM mcr.microsoft.com/dotnet/framework/runtime:4.8
SHELL ["powershell", "-NoProfile", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]

ARG URL=https://cdn.zabbix.com/zabbix/binaries/stable/6.4/6.4.10/zabbix_agent-6.4.10-windows-amd64-openssl.msi

WORKDIR C:\\temp

RUN Invoke-Webrequest -uri $env:URL -usebasicparsing -outfile "C:\temp\zabbix_agent-6.4.10-windows-amd64-openssl.msi"
RUN Start-Process 'msiexec' -NoNewWindow -Wait \
        -ArgumentList "/i", "C:\temp\zabbix_agent-6.4.10-windows-amd64-openssl.msi", "/qn", "/norestart", "/L*v", "c:\temp\install.log", "LISTENPORT=10051", "HOSTNAME=myhost", "SERVER=127.0.0.1", "SKIP=fw"
RUN Get-Service -Name 'Zabbix Agent'

I removed the FirewallException block from the wxs file:

diff --git a/installer/windows_exporter.wxs b/installer/windows_exporter.wxs
index bf52f61..220c76c 100644
--- a/installer/windows_exporter.wxs
+++ b/installer/windows_exporter.wxs
@@ -49,11 +49,7 @@

     <ComponentGroup Id="Files">
       <Component Directory="APPLICATIONROOTDIRECTORY">
-        <File Id="windows_exporter.exe" Name="windows_exporter.exe" Source="Work\windows_exporter.exe" KeyPath="yes">
-          <fw:FirewallException Id="MetricsEndpoint" Name="windows_exporter (HTTP [LISTEN_PORT])" Description="windows_exporter HTTP endpoint" Port="[LISTEN_PORT]" Protocol="tcp" IgnoreFailure="yes">
-            <fw:RemoteAddress Value="[REMOTE_ADDR]" />
-          </fw:FirewallException>
-        </File>
+        <File Id="windows_exporter.exe" Name="windows_exporter.exe" Source="Work\windows_exporter.exe" KeyPath="yes"/>
         <ServiceInstall Id="InstallExporterService" Name="windows_exporter" DisplayName="windows_exporter" Description="Exports Prometheus metrics about the system" ErrorControl="normal" Start="auto" Type="ownProcess" Arguments="--log.file eventlog [CollectorsFlag] --web.listen-address [LISTEN_ADDR]:[LISTEN_PORT] [MetricsPathFlag] [TextfileDirsFlag] [ExtraFlags]">
           <util:ServiceConfig FirstFailureActionType="restart" SecondFailureActionType="restart" ThirdFailureActionType="restart" RestartServiceDelayInSeconds="60" />
           <ServiceDependency Id="wmiApSrv" />

And it's still failing (install1.log attached).
install1.log

What I found in the EventLog regarding its service startup:

$search = Read-Host -Prompt "Enter Search Term"; (Get-EventLog -LogName System -Source "Service Control Manager" -after (Get-Date).AddDays(-1) | Select-Object -Property TimeGenerated, EntryType, Source, Message) -match $search | Sort-Object TimeGenerated | Format-Table -AutoSize -Wrap
Enter Search Term: windows_exporter

TimeGenerated          EntryType Source                  Message
-------------          --------- ------                  -------
1/13/2024 8:10:34 PM Information Service Control Manager A service was installed in the system.

                                                         Service Name:  windows_exporter
                                                         Service File Name:  "c:\Program Files\windows_exporter\windows_exporter.exe" --log.file eventlog  --web.listen-address 127.0.0.1:9182
                                                         Service Type:  user mode service
                                                         Service Start Type:  auto start
                                                         Service Account:  LocalSystem
1/13/2024 8:11:04 PM       Error Service Control Manager The windows_exporter service failed to start due to the following error:
                                                         %%1053
1/13/2024 8:11:04 PM       Error Service Control Manager A timeout was reached (30000 milliseconds) while waiting for the windows_exporter service to connect.
1/13/2024 8:11:39 PM       Error Service Control Manager The windows_exporter service failed to start due to the following error:
                                                         %%1053
1/13/2024 8:11:39 PM       Error Service Control Manager A timeout was reached (30000 milliseconds) while waiting for the windows_exporter service to connect.
1/13/2024 8:12:14 PM       Error Service Control Manager The windows_exporter service failed to start due to the following error:
                                                         %%1053
1/13/2024 8:12:14 PM       Error Service Control Manager A timeout was reached (30000 milliseconds) while waiting for the windows_exporter service to connect.
1/13/2024 8:12:49 PM       Error Service Control Manager A timeout was reached (30000 milliseconds) while waiting for the windows_exporter service to connect.
1/13/2024 8:12:49 PM       Error Service Control Manager The windows_exporter service failed to start due to the following error:
                                                         %%1053
1/13/2024 8:13:24 PM       Error Service Control Manager The windows_exporter service failed to start due to the following error:
                                                         %%1053
1/13/2024 8:13:24 PM       Error Service Control Manager A timeout was reached (30000 milliseconds) while waiting for the windows_exporter service to connect.
1/13/2024 8:13:59 PM       Error Service Control Manager The windows_exporter service failed to start due to the following error:
                                                         %%1053
1/13/2024 8:13:59 PM       Error Service Control Manager A timeout was reached (30000 milliseconds) while waiting for the windows_exporter service to connect.
1/13/2024 8:14:34 PM       Error Service Control Manager The windows_exporter service failed to start due to the following error:
                                                         %%1053
1/13/2024 8:14:34 PM       Error Service Control Manager A timeout was reached (30000 milliseconds) while waiting for the windows_exporter service to connect.

#946 seems to be the same issue

Hi, the MSI package is not designed for in-container environments. Please use the standalone binary or the pre-built Docker images.
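
For reference, the standalone-binary route could look roughly like the sketch below. It reuses the base image and SHELL line from the reproduction Dockerfile above. The `.exe` asset name is an assumption modeled on the MSI name, and the `C:\windows_exporter` directory is a hypothetical choice, so check both against the release page before relying on them.

```dockerfile
FROM mcr.microsoft.com/dotnet/framework/runtime:4.8
SHELL ["powershell", "-NoProfile", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]

ARG VERSION=0.25.0
# Assumption: the release publishes a standalone .exe next to the .msi under this name.
ARG URL=https://github.com/prometheus-community/windows_exporter/releases/download/v${VERSION}/windows_exporter-${VERSION}-amd64.exe

# Hypothetical install directory; any writable path works.
WORKDIR C:\\windows_exporter
RUN Invoke-WebRequest -Uri $env:URL -UseBasicParsing -OutFile 'C:\windows_exporter\windows_exporter.exe'

# 9182 is the listen port shown in the exporter's startup log above.
EXPOSE 9182

# Run the exporter as a plain foreground process: no MSI, no Windows service.
ENTRYPOINT ["C:\\windows_exporter\\windows_exporter.exe"]
```

Here the exporter is the container's main process. Running it next to another application in the same container would need a startup script that launches both, which this sketch does not cover.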

Hi @jkroepke

I can successfully install any MSI package in the container, but that doesn't matter if the binary can't be started as a Windows service inside that container.
My issue is the same as #946.

The installation via the MSI package is successful in general; however, it seems that the Windows service manager has issues inside the container:

A timeout was reached (30000 milliseconds) while waiting for the windows_exporter service to connect.

As a result, the MSI package treats this as an installation error and rolls back.

You could try to increase the timeout: https://support.site24x7.com/portal/en/kb/articles/timeout-error-or-unable-to-start-a-service-on-a-windows-server
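
A common way to raise this timeout on Windows is the `ServicesPipeTimeout` registry value (a DWORD, in milliseconds). The fragment below is a minimal sketch that assumes this is the setting the linked article refers to. The 120000 ms value is an arbitrary example. The value is normally read when the service control manager starts, so whether it takes effect for a later build step or inside a running container is not verified here.

```dockerfile
FROM mcr.microsoft.com/dotnet/framework/runtime:4.8
SHELL ["powershell", "-NoProfile", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]

# Assumption: raising ServicesPipeTimeout from the default 30000 ms gives the
# service more time to connect to the service control manager.
# 120000 ms is an arbitrary example value.
RUN Set-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Control' -Name 'ServicesPipeTimeout' -Value 120000 -Type DWord

# Print the value so the build log shows what was written.
RUN Get-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Control' -Name 'ServicesPipeTimeout'
```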

Or run the standalone binary to avoid weird behavior.

Yeah, I know that. The timeout is fine, and increasing it won't help, because the Go code fails to detect that it was started as a Windows service inside the container. As I mentioned before, the issue seems to be the same as in #946.

I'm running a .NET app in a Windows container on AWS ECS Fargate, and I can't run windows_exporter any other way than as a Windows service inside that container.

#946 mentions golang/go#56335

Once it's resolved, it should work for you.

Seems like the same issue was addressed in this opentelemetry-collector PR.

They invoke svc.Run as part of the main routine, which introduces other issues on Windows hosts.

See also:

There are already some issues with the Windows service when windows_exporter runs natively on a host, and some workarounds are required to avoid the timeouts.

Feel free to fork and patch windows_exporter OR follow Docker best practices and avoid introducing an additional managed service inside containers.

If only I was a Golang developer :)

I'm aware of docker best practices, but still, there are also good practices like running only stateless services inside of the docker containers, which is commonly violated nowadays :) My use case is kind of like that.

Anyway, thanks for your time.