telepresenceio / telepresence

Local development against a remote Kubernetes or OpenShift cluster

Home Page: https://www.telepresence.io

Root Daemon: Not running

phooijenga opened this issue

commented

Describe the bug

The telepresence root daemon does not start. The daemon.log file is empty.

To Reproduce

  1. Run telepresence connect. It tells me it needs root privileges, which I provide:
    $ telepresence connect
    Launching Telepresence Root Daemon
    Need root privileges to run: /Users/paul/bin/telepresence daemon-foreground /Users/paul/Library/Logs/telepresence '/Users/paul/Library/Application Support/telepresence'
  2. Run telepresence status, observe that root daemon is not running
    $ telepresence status
    OSS User Daemon: Running
      Version           : v2.18.0
      Executable        : /Users/paul/bin/telepresence
      Install ID        : b4931622-dabf-46bc-8218-266d4b782476
      Status            : Connected
      Kubernetes server : https://198.19.249.184:6443
      Kubernetes context: founda-k3s-1
      Namespace         : apps
      Manager namespace : ambassador
      Intercepts        : 0 total
    Root Daemon: Not running
    OSS Traffic Manager: Connected
      Version      : v2.18.0
      Traffic Agent: docker.io/datawire/tel2:2.18.0

I don't see a daemon-foreground process in the ps output. When I run the command manually, it doesn't seem to crash (and it writes a startup message to daemon.log), but telepresence status still reports 'Not running'. It does create /var/run/telepresence-daemon.socket.

Expected behavior

A clear and concise description of what you expected to happen.

Versions (please complete the following information):

OSS Client         : v2.18.0
OSS Root Daemon    : v2.18.0
OSS User Daemon    : v2.18.0
OSS Traffic Manager: v2.18.0
Traffic Agent      : docker.io/datawire/tel2:2.18.0

macOS Sonoma 14.4.1 (23E224)

Additional context

It appears as if this issue started happening after upgrading to macOS Sonoma 14.4.1.

daemon.log

Is this amd64 or arm64 (M1)?

commented

M1.

$ arch
arm64
$ file `which telepresence`
/Users/paul/bin/telepresence: Mach-O 64-bit executable arm64

commented

I did some debugging, and it turns out that EnsureUserDaemon swallows the error returned by ensureRootDaemonRunning.

In my case, the error is "daemon service did not start: timeout while waiting for daemon to start", which unfortunately does not tell us anything new.

commented

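So, it turns out that this system has timestamp_timeout=0 configured in sudoers, which makes sudo prompt for a password every time. As a result, running sudo true doesn't actually do anything: no credentials are cached for the daemon launch that follows.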

Apparently timestamp_timeout=0 is now company policy, so I can't simply change it.

commented

Alright, to wrap this all up: if I manually start the root daemon with sudo before running telepresence connect, it works.
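For anyone hitting the same policy, here is a minimal sketch of that workaround. The daemon-foreground arguments are the ones printed by telepresence connect in the report above. The helper name start_root_daemon is made up for illustration, and using sudo --background to detach the daemon after the password prompt is an assumption that has not been tested against Telepresence.

```shell
# start_root_daemon TELEPRESENCE_BIN LOG_DIR CONFIG_DIR
# Hypothetical helper: starts the root daemon by hand so that
# "telepresence connect" does not have to launch it.
start_root_daemon() {
  # sudo prompts for the password here, in the foreground, so no cached
  # credentials are needed. --background tells sudo to run the command in
  # the background once authentication has succeeded (assumption: the
  # daemon behaves normally when started this way).
  sudo --background "$1" daemon-foreground "$2" "$3"
}

# Example call, using the paths from the report above:
#   start_root_daemon /Users/paul/bin/telepresence \
#     /Users/paul/Library/Logs/telepresence \
#     '/Users/paul/Library/Application Support/telepresence'
#   telepresence connect
```

Running the same sudo command without --background in a separate terminal should be equivalent, at the cost of keeping that terminal occupied.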

Thanks for the info. Any ideas on how we can improve how this is handled in Telepresence?

commented

I think not hiding the error is a good start (#3559), but I'm not sure if the underlying problem can be solved completely. Maybe ensureRootDaemonRunning could check whether the process is still alive in addition to trying to connect to the socket. That way the user wouldn't have to wait the full 10 seconds to be told the daemon failed to start.
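The fail-fast idea can be sketched roughly as follows. This is shell for brevity, while Telepresence itself is written in Go, so it only illustrates the control flow and is not a patch. The function name wait_for_daemon and its return codes are invented here, and checking that the socket file exists is a simplification of actually connecting to it.

```shell
# wait_for_daemon PID SOCKET [TIMEOUT_SECONDS]
# Hypothetical helper. Returns 0 once SOCKET exists, 1 if the process
# exits first, 2 if the timeout (default 10 seconds) expires.
wait_for_daemon() {
  wfd_pid=$1
  wfd_socket=$2
  wfd_timeout=${3:-10}
  wfd_elapsed=0
  while [ "$wfd_elapsed" -lt "$wfd_timeout" ]; do
    # Success: the daemon has created its socket.
    if [ -S "$wfd_socket" ]; then
      return 0
    fi
    # Fail fast: stop waiting as soon as the process is gone. ps -p is
    # used instead of kill -0 because the daemon runs as root, and an
    # unprivileged kill -0 on a root-owned process fails with a
    # permission error even while the process is alive.
    if ! ps -p "$wfd_pid" >/dev/null 2>&1; then
      return 1
    fi
    sleep 1
    wfd_elapsed=$((wfd_elapsed + 1))
  done
  return 2
}
```

With a check like this, a daemon that dies right after launch would be reported within about a second instead of after the full timeout.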

Another possibility (which I've not extensively tested) might be to run sudo --list (instead of sudo true) and check for timestamp_timeout=0 in the output. If it's there, telepresence can instruct the user how to run the daemon themselves.
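A rough sketch of the sudo --list check, assuming the setting shows up in the Defaults entries that sudo --list prints. The function name has_zero_timestamp_timeout is hypothetical, and, like the idea itself, this has not been tested extensively.

```shell
# Reads "sudo --list" output on stdin and succeeds only if it contains
# timestamp_timeout=0 as a complete setting (so 0.5 or 05 do not match).
has_zero_timestamp_timeout() {
  grep -Eq '(^|[[:space:],])timestamp_timeout=0([[:space:],]|$)'
}

# Example use (the message text is only illustrative):
#   if sudo --list | has_zero_timestamp_timeout; then
#     echo "sudo does not cache credentials; start the root daemon manually"
#   fi
```

Note that sudo --list may itself prompt for a password, so the check would have to run at a point where prompting is acceptable.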
Another possibility would be to run sudo --non-interactive --no-update --validate twice to check whether the user's cached credentials are valid (or no authentication is required): once before prompting (instead of the current sudo --non-interactive true), and once again afterwards to make sure the credentials were indeed cached.
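The double validation could look something like this. It is a sketch in shell, not the Go code Telepresence would need. The function names and return codes are made up, and prompting with sudo true is assumed from the discussion above.

```shell
# Succeeds if sudo could run a command right now without prompting.
sudo_ready() {
  sudo --non-interactive --no-update --validate 2>/dev/null
}

# ensure_sudo_ready (hypothetical) returns:
#   0 if sudo can be used without a further prompt
#   1 if the user failed to authenticate
#   2 if authentication succeeded but nothing was cached
ensure_sudo_ready() {
  # First check: credentials already cached, or none required.
  if sudo_ready; then
    return 0
  fi
  # Prompt once.
  if ! sudo true; then
    return 1
  fi
  # Second check: did the prompt actually leave usable credentials?
  if sudo_ready; then
    return 0
  fi
  return 2
}
```

A return code of 2 is the timestamp_timeout=0 case from this issue, where telepresence could tell the user to start the root daemon manually instead of waiting for a timeout.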

It looks like the error display has been addressed. I'll leave this open as a feature request for the process check suggestions.