telepresenceio / telepresence

Local development against a remote Kubernetes or OpenShift cluster

Home Page: https://www.telepresence.io


Traffic-agent sidecar does not appear when traffic-manager is installed in an istio-enabled namespace

alextricity25 opened this issue

Describe the bug
Hey guys 👋. I have a pod which I'm trying to intercept. It runs in an environment with Istio configured. Normally, this pod has two containers: the container running my app code, and the Istio proxy sidecar. When I intercept the pod, I see that the traffic manager spins up a new pod, but the traffic-agent container never appears. The pod should now have three containers, correct? 1. app 2. istio sidecar 3. traffic agent. However, I only see two containers running, and no traffic agent :/

I looked at the traffic-manager logs (running in debug mode), and these messages stick out to me:

traffic-manager 2024/03/29 20:59:57 http: TLS handshake error from 127.0.0.6:50859: EOF
traffic-manager 2024/03/29 20:59:57 http: TLS handshake error from 127.0.0.6:38917: EOF
traffic-manager 2024/03/29 20:59:58 http: TLS handshake error from 127.0.0.6:37491: EOF
traffic-manager 2024/03/29 21:00:02 http: TLS handshake error from 127.0.0.6:54201: EOF
traffic-manager 2024-03-29 21:00:03.1760 info    httpd/conn=127.0.0.1:8081 : Warning Unhealthy Readiness probe failed: HTTP probe failed with statuscode: 503 : session_id="2342f802-490b-48dd-944d-b3813e1a24d6"
traffic-manager 2024-03-29 21:00:04.4588 debug   podWatcher calling updateSubnets with [10.252.0.0/23]
traffic-manager 2024-03-29 21:00:05.1666 info    httpd/conn=127.0.0.1:8081 : Warning Unhealthy Readiness probe failed: HTTP probe failed with statuscode: 503 : session_id="2342f802-490b-48dd-944d-b3813e1a24d6"
traffic-manager 2024-03-29 21:00:07.1741 info    httpd/conn=127.0.0.1:8081 : Warning Unhealthy Readiness probe failed: HTTP probe failed with statuscode: 503 : session_id="2342f802-490b-48dd-944d-b3813e1a24d6"

Nothing else in the logs looks glaring to me.
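One detail worth noting in those logs: 127.0.0.6 is the source address Istio's sidecar typically uses for inbound passthrough traffic, so the TLS handshake errors suggest the istio-proxy is sitting in front of the traffic-manager's mTLS endpoint. A quick sketch for pulling those signals out of a saved copy of the log (tm.log is a hypothetical filename, not something telepresence produces):

```shell
# Count Istio-sourced TLS handshake failures and readiness-probe failures
# in a saved traffic-manager log. 'tm.log' is a hypothetical filename.
grep -c 'TLS handshake error from 127.0.0.6' tm.log
grep -c 'Readiness probe failed' tm.log
```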

Some other things to note:

When I run the traffic manager outside my service mesh (so that the traffic manager doesn't get an Istio sidecar), everything works just fine, but my app runs extremely slowly. So I'm now trying to run telepresence in the same namespace as my app, which has the Istio service mesh configured.

telepresence_logs.zip

To Reproduce
Steps to reproduce the behavior:

  1. When I run telepresence intercept xrdm-portal --port 80:80 --docker-build ./devops/local-development --docker-build-opt file=./devops/local-development/Dockerfile.portal-web-watch -- --rm --name blah -e WATCH=true -v ./apps/portal:/app/ IMAGE
  2. I see
Connected to context vcluster_vcluster-d4400caa_telepresence-04-29-03-vcluster_gke_xrdm-dev_us-central1_shared-review-cluster-7039273, namespace xrdm (https://telepresence-04-29-03-shared-cluster.xrdm.dev)
telepresence intercept: error: connector.CreateIntercept: request timed out while waiting for agent xrdm-portal.xrdm to arrive: Events that may be relevant:
AGE     TYPE      REASON      OBJECT                             MESSAGE
1m54s   Warning   Unhealthy   pod/xrdm-portal-58fb57497f-zsp7n   Readiness probe failed: HTTP probe failed with statuscode: 503
1m54s   Warning   Unhealthy   pod/xrdm-portal-58fb57497f-zsp7n   Readiness probe failed: HTTP probe failed with statuscode: 503
1m54s   Warning   Unhealthy   pod/xrdm-portal-58fb57497f-zsp7n   Readiness probe failed: HTTP probe failed with statuscode: 503

  3. Inspect the traffic manager logs in the cluster
  4. See error

Expected behavior
The pod running my app would run with three containers.

  1. Istio proxy
  2. my app code
  3. traffic agent
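The expected container set above can be checked directly; here's a sketch (the pod name and namespace are taken from the event output earlier and would differ in another environment):

```shell
# List the names of the containers in the intercepted pod. With a working
# intercept you'd expect app, istio-proxy, and traffic-agent to all appear.
kubectl get pod xrdm-portal-58fb57497f-zsp7n -n xrdm \
  -o jsonpath='{range .spec.containers[*]}{.name}{"\n"}{end}'
```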

Versions (please complete the following information):

  • Output of telepresence version
OSS Client             : v2.18.0
OSS Daemon in container: v2.18.0
Traffic Manager        : v2.18.0
Traffic Agent          : not reported by traffic-manager
  • Operating system of workstation running telepresence commands
macOS 14.3.1
  • Kubernetes environment and Version [e.g. Minikube, bare metal, Google Kubernetes Engine]
1.27.8-gke.1067004

I've also tried this with the Trial Plan on v2.19.0

Client             : v2.19.0
Daemon in container: v2.19.0
Traffic Manager    : v2.19.0
Traffic Agent      : docker.io/datawire/ambassador-telepresence-agent:1.14.5

It seems like running the traffic-manager with an istio-proxy sidecar is what causes this issue. When I remove the istio-proxy sidecar from the traffic-manager, telepresence sets up the agent sidecar on my app pod successfully.
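If the istio-proxy on the traffic-manager pod is indeed the trigger, one workaround worth trying (a sketch, not an official fix) is to keep the traffic-manager in the meshed namespace but opt just its pod out of sidecar injection using Istio's standard annotation. The deployment name and namespace below are assumptions for this setup:

```shell
# Exclude only the traffic-manager pod from Istio sidecar injection.
# 'sidecar.istio.io/inject: "false"' is Istio's standard opt-out annotation;
# the deployment name and namespace are assumed for this environment.
kubectl patch deployment traffic-manager -n xrdm --type merge -p \
  '{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/inject":"false"}}}}}'
```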

Hi @alextricity25, you're getting a readiness probe error which is probably related to the Istio sidecar. It's interesting that it works outside the mesh, but once inside the mesh you start getting these probe failures. Have you configured this? If not, can you try it and see if it helps? It should help integrate the traffic manager into Istio.

Hi @cindymullins-dw,

Thank you for your reply. Yes, I've configured the serviceMesh to be of type istio when installing the helm chart. My service also uses symbolic ports. Here are all of my helm chart's values for reference:

agent:
  image:
    name: ambassador-telepresence-agent
ambassador-agent:
  enabled: false
image:
  registry: docker.io/datawire
  tag: 2.19.0
systemaHost: app.getambassador.io
systemaPort: "443"
trafficManager:
  serviceMesh:
    type: istio
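For completeness, these values would be applied with something like the following (the chart repo, release name, and namespace are assumptions; adjust to however the chart was originally installed):

```shell
# Apply the values above to the traffic-manager release.
# Release name, chart reference, and namespace are assumed here.
helm repo add datawire https://app.getambassador.io
helm upgrade --install traffic-manager datawire/telepresence \
  --namespace xrdm -f values.yaml
```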

@alextricity25 the logs you provided are client-only. As such, they don't tell us anything about what's going on with the traffic-agent. Would it be possible for you to include the logs from the traffic-manager and the failing pod? Also, using Helm value logLevel=debug while producing those logs would be helpful.
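A sketch of the log collection being asked for here: telepresence's built-in gather-logs command bundles client, daemon, traffic-manager, and traffic-agent logs into one zip (the release name and namespace in the Helm command are assumptions for this setup):

```shell
# Turn on debug logging in the traffic-manager, reproduce the failing
# intercept, then bundle all relevant logs into a single zip for the issue.
helm upgrade traffic-manager datawire/telepresence \
  --namespace xrdm --reuse-values --set logLevel=debug
telepresence gather-logs --output-file ./telepresence_logs.zip
```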

Closing this due to lack of response.

Apologies for not getting back to you with this. I haven't been able to spin up a new environment to test this yet, but once I get the chance I'll post the results here!