cloudflare / cloudflared

Cloudflare Tunnel client (formerly Argo Tunnel)

Home Page: https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/install-and-setup/tunnel-guide

🐛 2024.1.1 did defaults regarding "post-quantum tunnels" change?

fritz-net opened this issue

Describe the bug
I'm running cloudflare/cloudflared:1458-c1d8c5e960ed. Auto-update was enabled by default, so it updated itself to 2024.1.1 today.
Now I get "You are hitting an error while using the experimental post-quantum tunnels feature." but no real error message, and the tunnel has stopped working.

To Reproduce
The config is the default; I only set the container command and args like this:

      containers:
      - command:
        - cloudflared
        - tunnel
        - --metrics
        - 0.0.0.0:2000
        - run
        args:
        - --token
        - $(TOKEN)

Expected behavior
The tunnel keeps working after an auto-update.

Environment and versions

  • OS: k8s linux
  • Architecture: amd64
  • Version: 2024.1.1

Logs and errors

2024-01-11T10:31:54Z WRN ICMP proxy feature is disabled error="cannot create ICMPv4 proxy: Group ID 65532 is not between ping group 1 to 0 nor ICMPv6 proxy: socket: permission denied"
2024-01-11T10:31:54Z INF Starting metrics server on [::]:2000/metrics
2024-01-11T10:31:54Z INF Registered tunnel connection connIndex=0 connection=cc1364fc-94f0-46e4-b9ed-f2a793f028c5 event=0 ip=**ip1** location=bru01 protocol=quic
2024-01-11T10:31:55Z INF Registered tunnel connection connIndex=1 connection=a5f33407-fb03-4fc7-9913-db05cb162b8c event=0 ip=**ip2** location=fra03 protocol=quic
2024-01-11T10:31:55Z INF Updated to new configuration config="{\"ingress\":[{\"hostname\":\"demo.**domain**\",\"id\":\"0\",\"originRequest\":{\"access\":{\"audTag\":[],\"required\":false,\"teamName\":\"**teamname**\"}},\"service\":\"http://ingress-nginx-controller.ingress-nginx.svc\"},{\"hostname\":\"prod.**domain**\",\"originRequest\":{},\"service\":\"http://ingress-nginx-controller.ingress-nginx.svc\"},{\"hostname\":\"prod.internal.**domain**\",\"id\":\"1\",\"originRequest\":{},\"service\":\"http://ingress-nginx-controller.ingress-nginx.svc\"},{\"hostname\":\"kb.**domain**\",\"id\":\"3\",\"originRequest\":{\"access\":{\"required\":true,\"teamName\":\"**teamname**\"},\"noTLSVerify\":true},\"service\":\"https://elastic-kb-http.preprod.svc:5601\"},{\"hostname\":\"pg.**domain**\",\"id\":\"4\",\"originRequest\":{\"access\":{\"required\":true,\"teamName\":\"**teamname**\"}},\"service\":\"http://staging-backend-pgadmin-service.preprod.svc.cluster.local\"},{\"hostname\":\"bin.**domain**\",\"id\":\"5\",\"originRequest\":{\"access\":{\"required\":true,\"teamName\":\"**teamname**\"}},\"service\":\"http://web-service\"},{\"hostname\":\"*.**domain**\",\"id\":\"6\",\"originRequest\":{\"httpHostHeader\":\"prod.**domain**\"},\"service\":\"https://ingress-nginx-controller.ingress-nginx.svc\"},{\"service\":\"https://ingress-nginx-controller.ingress-nginx.svc\"}],\"warp-routing\":{\"enabled\":false}}" version=49
2024-01-11T10:31:55Z INF cloudflared has been updated version=2024.1.1
2024-01-11T10:31:55Z INF Restarting service managed by SysV...
2024-01-11T10:31:55Z INF PID of the new process is 18
2024-01-11T10:31:55Z ERR Initiating shutdown error="cloudflared has been updated to version 2024.1.1"
2024-01-11T10:31:55Z INF Metrics server stopped
2024-01-11T10:31:55Z INF Unregistered tunnel connection connIndex=1 event=0 ip=**ip2**
2024-01-11T10:31:55Z ERR writing release: Application error 0x0 (local)
2024-01-11T10:31:55Z INF Unregistered tunnel connection connIndex=0 event=0 ip=**ip1**
2024-01-11T10:31:55Z WRN Failed to serve quic connection error="context canceled" connIndex=0 event=0 ip=**ip1**
2024-01-11T10:31:55Z ERR Failed to serve quic connection error="context canceled" connIndex=1 event=0 ip=**ip2**
2024-01-11T10:31:55Z INF Starting tunnel tunnelID=68af00c8-12c6-49d3-a5f5-a1a78f3a6cb8
2024-01-11T10:31:55Z INF Version 2024.1.1
2024-01-11T10:31:55Z INF GOOS: linux, GOVersion: go1.21.5, GoArch: amd64
2024-01-11T10:31:55Z INF Settings: map[metrics:0.0.0.0:2000 token:*****]
2024-01-11T10:31:55Z INF Autoupdate frequency is set autoupdateFreq=86400000
2024-01-11T10:31:55Z INF Generated Connector ID: 2206368a-e131-4671-a835-687a7b187739
2024-01-11T10:31:55Z INF Initial protocol quic
2024-01-11T10:31:55Z INF ICMP proxy will use **ip7** as source for IPv4
2024-01-11T10:31:55Z INF ICMP proxy will use **ipv6_1** in zone eth0 as source for IPv6
2024-01-11T10:31:55Z WRN The user running cloudflared process has a GID (group ID) that is not within ping_group_range. You might need to add that user to a group within that range, or instead update the range to encompass a group the user is already in by modifying /proc/sys/net/ipv4/ping_group_range. Otherwise cloudflared will not be able to ping this network error="Group ID 65532 is not between ping group 1 to 0"
2024-01-11T10:31:55Z WRN ICMP proxy feature is disabled error="cannot create ICMPv4 proxy: Group ID 65532 is not between ping group 1 to 0 nor ICMPv6 proxy: socket: permission denied"
2024-01-11T10:31:55Z INF Starting metrics server on [::]:2000/metrics
2024-01-11T10:31:55Z INF 

===================================================================================
You are hitting an error while using the experimental post-quantum tunnels feature.

Please check:

   https://pqtunnels.cloudflareresearch.com

for known problems.
===================================================================================


2024-01-11T10:31:55Z ERR Failed to create new quic connection error="failed to dial to edge with quic: context canceled" connIndex=2 event=0 ip=**ip3**
2024-01-11T10:31:55Z INF Registered tunnel connection connIndex=0 connection=4d84e259-ca7b-4d75-9179-519529c6eee3 event=0 ip=**ip3** location=fra07 protocol=quic
2024-01-11T10:31:56Z INF Updated to new configuration config="{\"ingress\":[{\"hostname\":\"****\",\"id\":\"0\",\"originRequest\":{\"access\":{\"audTag\":[],\"required\":false,\"teamName\":\"**teamname**\"}},\"service\":\"http://ingress-nginx-controller.ingress-nginx.svc\"},{\"hostname\":\"prod.**domain**\",\"originRequest\":{},\"service\":\"http://ingress-nginx-controller.ingress-nginx.svc\"},{\"hostname\":\"prod.internal.**domain**\",\"id\":\"1\",\"originRequest\":{},\"service\":\"http://ingress-nginx-controller.ingress-nginx.svc\"},{\"hostname\":\"kb.**domain**\",\"id\":\"3\",\"originRequest\":{\"access\":{\"required\":true,\"teamName\":\"**teamname**\"},\"noTLSVerify\":true},\"service\":\"https://elastic-kb-http.preprod.svc:5601\"},{\"hostname\":\"pg.**domain**\",\"id\":\"4\",\"originRequest\":{\"access\":{\"required\":true,\"teamName\":\"**teamname**\"}},\"service\":\"http://staging-backend-pgadmin-service.preprod.svc.cluster.local\"},{\"hostname\":\"bin.**domain**\",\"id\":\"5\",\"originRequest\":{\"access\":{\"required\":true,\"teamName\":\"**teamname**\"}},\"service\":\"http://web-service\"},{\"hostname\":\"*.**domain**\",\"id\":\"6\",\"originRequest\":{\"httpHostHeader\":\"prod.**domain**\"},\"service\":\"https://ingress-nginx-controller.ingress-nginx.svc\"},{\"service\":\"https://ingress-nginx-controller.ingress-nginx.svc\"}],\"warp-routing\":{\"enabled\":false}}" version=49
2024-01-11T10:31:56Z INF Registered tunnel connection connIndex=1 connection=5af9ce80-cbc0-4bfc-96ab-7f29caa970a0 event=0 ip=**ip4** location=bru01 protocol=quic
2024-01-11T10:31:56Z ERR Failed to create new quic connection error="failed to dial to edge with quic: context canceled" connIndex=3 event=0 ip=**ip5**
2024-01-11T10:31:57Z INF Registered tunnel connection connIndex=2 connection=89062dc1-5c4c-48da-95ed-4eda107d6adc event=0 ip=**ip2** location=fra08 protocol=quic
2024-01-11T10:31:57Z INF Tunnel server stopped

Workaround
I appended --no-autoupdate to the command and it works as before.

This might be related to an issue with quic-go's ECN support, which doesn't work in all kinds of environments. ECN is only an optimization, introduced in our latest quic-go bump, so we will release a hotfix today that disables it by default until there is a permanent fix.

Regardless, using Docker images with auto-update enabled is not recommended and is a big red flag. Please use a system that upgrades your Docker image instead of relying on cloudflared auto-update (e.g. a cronjob, watchtower, the cloudflared image with the "latest" tag, etc.). Docker images are meant to be immutable, so upgrading the binary inside one is not a safe thing to do. Also, if we change an environment variable or anything else in the image that cloudflared needs in order to run properly, your old image won't be able to run the new versions.

Regardless, using docker images with auto-update flag is not recommended and is a big red flag.

The helm chart cloudflare-tunnel-remote passes the flag, but without a boolean value, so it is set to false by default. Our pods were still updated automatically even though the flag is there.

https://github.com/cloudflare/helm-charts/blob/main/charts/cloudflare-tunnel-remote/templates/deployment.yaml

@connors2015 Hi, could you share any logs related to that upgrade? The code only checks whether the flag is present; it doesn't look for true or false. So the helm chart command is not expected to let the binary auto-update inside the container.

https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/configure-tunnels/tunnel-run-parameters/#no-autoupdate

Doesn't the Cloudflare documentation say that, without the boolean, it defaults to false?

I have some logs from the pod, but not from the first crash, since the pod kept restarting until the update an hour ago.

2024-01-11T18:57:48Z INF Starting tunnel tunnelID=xxxx-xxx-xxxx
2024-01-11T18:57:48Z INF Version 2023.10.0
2024-01-11T18:57:48Z INF GOOS: linux, GOVersion: go1.20.6, GoArch: amd64
2024-01-11T18:57:48Z INF Settings: map[metrics:0.0.0.0:2000]
2024-01-11T18:57:48Z INF Environmental variables map[TUNNEL_TOKEN:*****]
2024-01-11T18:57:48Z INF Generated Connector ID: xxxxx-xxxx-xxx
2024-01-11T18:57:48Z INF Autoupdate frequency is set autoupdateFreq=86400000
2024-01-11T18:57:48Z INF Initial protocol quic
2024-01-11T18:57:48Z INF ICMP proxy will use 10.30.0.98 as source for IPv4
2024-01-11T18:57:48Z INF ICMP proxy will use xxxxx::xxxxxx::xxxxxx::xxxxx in zone eth0 as source for IPv6
2024-01-11T18:57:48Z WRN The user running cloudflared process has a GID (group ID) that is not within ping_group_range. You might need to add that user to a group within that range, or instead update the range to encompass a group the user is already in by modifying /proc/sys/net/ipv4/ping_group_range. Otherwise cloudflared will not be able to ping this network error="Group ID 65532 is not between ping group 1 to 0"
2024-01-11T18:57:48Z WRN ICMP proxy feature is disabled error="cannot create ICMPv4 proxy: Group ID 65532 is not between ping group 1 to 0 nor ICMPv6 proxy: socket: permission denied"
2024-01-11T18:57:49Z INF Starting metrics server on [::]:2000/metrics
2024/01/11 18:57:49 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Receive-Buffer-Size for details.
2024-01-11T18:57:49Z INF cloudflared has been updated version=2024.1.2
2024-01-11T18:57:49Z INF Restarting service managed by SysV...
2024-01-11T18:57:49Z INF PID of the new process is 16
2024-01-11T18:57:49Z ERR Initiating shutdown error="cloudflared has been updated to version 2024.1.2"
2024-01-11T18:57:49Z ERR Failed to serve quic connection error="context canceled" connIndex=0 event=0 ip=000.000.000.000
2024-01-11T18:57:49Z INF Tunnel server stopped
2024-01-11T18:57:49Z INF Starting tunnel tunnelID=xxxxxxx-xxxxxxx-xxxxxxx-xxxxx
2024-01-11T18:57:49Z INF Version 2024.1.2
2024-01-11T18:57:49Z INF GOOS: linux, GOVersion: go1.21.5, GoArch: amd64
2024-01-11T18:57:49Z INF Settings: map[metrics:0.0.0.0:2000]
2024-01-11T18:57:49Z INF Environmental variables map[TUNNEL_TOKEN:*****]
2024-01-11T18:57:49Z INF Generated Connector ID: xxxxxxx-xxxxxxx-xxxxxx-xxxx
2024-01-11T18:57:49Z INF Autoupdate frequency is set autoupdateFreq=86400000
2024-01-11T18:57:49Z INF Initial protocol quic
2024-01-11T18:57:49Z INF ICMP proxy will use 10.30.0.98 as source for IPv4
2024-01-11T18:57:49Z INF ICMP proxy will use xxxx::xxxx::xxxx::xxxx in zone eth0 as source for IPv6
2024-01-11T18:57:49Z WRN The user running cloudflared process has a GID (group ID) that is not within ping_group_range. You might need to add that user to a group within that range, or instead update the range to encompass a group the user is already in by modifying /proc/sys/net/ipv4/ping_group_range. Otherwise cloudflared will not be able to ping this network error="Group ID 65532 is not between ping group 1 to 0"
2024-01-11T18:57:49Z WRN ICMP proxy feature is disabled error="cannot create ICMPv4 proxy: Group ID 65532 is not between ping group 1 to 0 nor ICMPv6 proxy: socket: permission denied"
2024-01-11T18:57:49Z INF Starting metrics server on [::]:2000/metrics
2024/01/11 18:57:49 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes for details.
2024-01-11T18:57:50Z INF Registered tunnel connection connIndex=0 connection=xxxx-xxxxxx-xxxxxx-xxxxx event=0 ip=000.000.000.000 location=iad03 protocol=quic
2024-01-11T18:57:50Z INF Metrics server stopped
cloudflared has been updated to version 2024.1.2

This is running the default values.yaml with only the tunnel_token variable set.

*removed IPs and IDs
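The log above shows one container reporting two different versions in a row (2023.10.0, then 2024.1.2), which is the sign that the binary replaced itself in place. A minimal sketch of how that could be spotted automatically, assuming the "INF Version ..." startup line format seen in these logs; the function names startedVersions and selfUpdated are hypothetical helpers, not part of cloudflared:

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// versionLine matches startup lines such as "... INF Version 2024.1.2".
// The format is taken from the logs in this thread and is an assumption.
var versionLine = regexp.MustCompile(`INF Version (\S+)$`)

// startedVersions returns the version announced at each process start, in order.
func startedVersions(logText string) []string {
	var versions []string
	sc := bufio.NewScanner(strings.NewReader(logText))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if m := versionLine.FindStringSubmatch(line); m != nil {
			versions = append(versions, m[1])
		}
	}
	return versions
}

// selfUpdated reports whether the log contains more than one distinct version,
// i.e. the binary changed while the same container kept running.
func selfUpdated(logText string) bool {
	vs := startedVersions(logText)
	for i := 1; i < len(vs); i++ {
		if vs[i] != vs[0] {
			return true
		}
	}
	return false
}

func main() {
	// Sample lines copied from the pod log above.
	sample := strings.Join([]string{
		"2024-01-11T18:57:48Z INF Version 2023.10.0",
		"2024-01-11T18:57:49Z INF cloudflared has been updated version=2024.1.2",
		"2024-01-11T18:57:49Z INF Version 2024.1.2",
	}, "\n")
	fmt.Println("versions seen:", startedVersions(sample))
	fmt.Println("self-updated in place:", selfUpdated(sample))
}
```

A check like this only reads logs; it does not prevent the update. Preventing it still requires passing --no-autoupdate, as discussed in this thread.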

This block in the cloudflared code for the no-autoupdate flag seems to indicate that the boolean flag's value will be false unless explicitly stated otherwise:

altsrc.NewBoolFlag(&cli.BoolFlag{

Going a little deeper into the altsrc package, we can see that the Apply() method used by NewBoolFlag() indicates the same:

https://github.com/urfave/cli/blob/7656c5fb838ca8a6febca43100147d317b544fd3/altsrc/flag_generated.go#L19

Please let me know if you determine this to be the case as well.

Hi @connors2015, just tried the code with a log to verify the behavior and it is the one I mentioned:
Without flag:

 ./cloudflared tunnel --url http://locahost:8888
2024-01-12T13:06:36Z INF Thank you for trying Cloudflare Tunnel(...)
2024-01-12T13:06:36Z INF Requesting new quick Tunnel on trycloudflare.com...
(...)
2024-01-12T13:06:38Z INF Auto update disabled: false

With flag:

./cloudflared --no-autoupdate tunnel --url http://locahost:8888
2024-01-12T13:07:12Z INF Thank you for trying Cloudflare Tunnel(...)
2024-01-12T13:07:12Z INF Requesting new quick Tunnel on trycloudflare.com...
(...)
2024-01-12T13:07:15Z INF Auto update disabled: true

This is because we use the method cli.Context.Bool:

autoupdater := updater.NewAutoUpdater(
	c.Bool("no-autoupdate"), c.Duration("autoupdate-freq"), &listeners, log,
)

Which only checks if the flag is there or not:
https://github.com/urfave/cli/blob/7656c5fb838ca8a6febca43100147d317b544fd3/flag_bool.go#L161
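The two runs above can be mimicked with a small sketch. It uses Go's standard flag package as a stand-in, not urfave/cli or cloudflared's real flag setup, so it only illustrates the general behaviour of a boolean flag whose default is false: leaving it out yields false, and passing it bare yields true. The function name parseNoAutoupdate and the FlagSet name are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
)

// parseNoAutoupdate parses args with a fresh FlagSet and returns the value of
// a boolean --no-autoupdate flag whose default is false.
// This is a stand-in built on the standard library, not cloudflared's code.
func parseNoAutoupdate(args []string) (bool, error) {
	fs := flag.NewFlagSet("cloudflared-sketch", flag.ContinueOnError)
	noAutoupdate := fs.Bool("no-autoupdate", false, "disable periodic self-update")
	// Declared only so that the sample arguments below parse cleanly.
	fs.String("metrics", "", "metrics listen address")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *noAutoupdate, nil
}

func main() {
	cases := [][]string{
		{"--metrics", "0.0.0.0:2000"},                    // flag absent
		{"--no-autoupdate", "--metrics", "0.0.0.0:2000"}, // bare flag present
	}
	for _, args := range cases {
		disabled, err := parseNoAutoupdate(args)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%v -> auto update disabled: %t\n", args, disabled)
	}
}
```

In this sketch the "false by default" in the documentation and the "present means disabled" behaviour are the same thing: false is the value when the flag is missing, and a bare flag needs no explicit true.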

I will close this issue, since the problem isn't with cloudflared but with the image being run without the --no-autoupdate flag. Feel free to reach out again if you have any questions.