aws / aws-app-mesh-examples

AWS App Mesh is a service mesh that you can use with your microservices to manage service to service communication.

Having 502 or https redirect loop (hard-mesh-time)

msaustral opened this issue

Hi @jlbutler, we saw your video on eksworkshop.com and we think you are our last hope.

We have had a (paid support) ticket open for more than 13 days trying to set up AWS App Mesh on our EKS cluster. We have solved some issues, but there is still no light at the end of the tunnel.

What do we have?

We have an EKS cluster with an ALB pointing to an Nginx service. Nginx passes requests to PHP-FPM through its FastCGI proxy, and there is a Memcache service that can be accessed by Nginx or PHP.

Frontend: ALB -> Nginx (8080) Http1
Backend: Php-fpm (9000) / Memcache (11211)

All Bitnami images

The ALB is set to redirect traffic from port 80 to 443 (HTTPS) and to forward it to Nginx on port 8080 through a ClusterIP service.

Everything works perfectly without the mesh, so we will skip the app YAML.

Before the App Mesh YAML, some test results:

When we send HTTPS traffic to the domain, we get a 302 loop and a "too many redirects" error. This happens both when accessing the domain in a browser and when running curl from a container inside the namespace.

So at this point we get a response from the Nginx service and PHP, but it keeps looping on a 302 because the mesh does not have TLS active.

When we activate TLS on the Nginx VN, PHP VN, Memcache VN, and virtual gateway, and set all client policies, we get a 502 and the Nginx error log shows an error from the PHP service: "Connection reset by peer". We guess that the PHP service does not support TLS (there is nothing in the PHP service log).

If we curl the PHP pod IP on port 9000 we get the same error. Without TLS active on the mesh we get "Empty reply from server" (at least a response).

So we are stuck at this point, where we get a response from Nginx but Nginx cannot communicate with PHP-FPM.

There are no errors in the Envoy containers of each service or in the App Mesh controller.

The service path with the mesh is:

ALB -> ingress (envoy) dep service with cluster ip -> virtual gateway -> virtual gateway route -> virtual route** -> virtual node -> virtual service-> dep. service -> pods

** Support told us that we can remove all virtual routes, since the traffic will be routed by the virtual gateway, but that does not work: it only routes traffic for the service added to the virtual gateway, that is, Nginx.

If we remove all virtual routes, the Envoy on the PHP pod starts logging errors saying that it cannot find the virtual node.

So please help us find what is wrong. I will go to church today and pray for Mesh! :)

Thank you in advance for any help you can give us.

The appmesh.yaml:

# app mesh
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: desuper-cl-ngnix-vn
  namespace: desuper-cl
spec:
  podSelector:
    matchLabels:
      app: nginx-desuper
      tier: frontend
  listeners:
    - portMapping:
        port: 8080
        protocol: http
  backends:
    - virtualService:
        virtualServiceRef:
          name: php-srv
    - virtualService:
        virtualServiceRef:
          name: memcache-srv
  # backendDefaults:
  #   clientPolicy:
  #     tls:
  #       enforce: true
  #       validation:
  #         trust:
  #           acm:
  #             certificateAuthorityARNs:
  #               - arn:aws:acm-pca:us-east-1:xxx
  serviceDiscovery:
    dns:
      hostname: srv-desuper-nginx.desuper-cl.svc.cluster.local
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: desuper-cl-php-vn
  namespace: desuper-cl
spec:
  podSelector:
    matchLabels:
      app: php-desuper
      tier: backend
  listeners:
    - portMapping:
        port: 9000
        protocol: tcp
  backends:
    - virtualService:
        virtualServiceRef:
          name: nginx-srv
    - virtualService:
        virtualServiceRef:
          name: memcache-srv
  # backendDefaults:
  #   clientPolicy:
  #     tls:
  #       enforce: true
  #       validation:
  #         trust:
  #           acm:
  #             certificateAuthorityARNs:
  #               - arn:aws:acm-pca:us-east-1:xxxx
  serviceDiscovery:
    dns:
      hostname: srv-desuper-php.desuper-cl.svc.cluster.local
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: desuper-cl-memcache-vn
  namespace: desuper-cl
spec:
  podSelector:
    matchLabels:
      app: memcache-desuper
      tier: backend
  listeners:
    - portMapping:
        port: 11211
        protocol: tcp
      # tls:
      #   mode: PERMISSIVE
      #   certificate:
      #     acm:
      #       certificateARN: arn:aws:acm:us-east-1:xxxx
  serviceDiscovery:
    dns:
      hostname: srv-desuper-memcache.desuper-cl.svc.cluster.local
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: desuper-cl-sftp-vn
  namespace: desuper-cl
spec:
  podSelector:
    matchLabels:
      app: sftp-desuper
  listeners:
    - portMapping:
        port: 22
        protocol: tcp
      # tls:
      #   mode: PERMISSIVE
      #   certificate:
      #     acm:
      #       certificateARN: arn:aws:acm:us-east-1:xxxx
  serviceDiscovery:
    dns:
      hostname: desuper-cl-sftp-vn.desuper-cl.svc.cluster.local
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualService
metadata:
  name: nginx-srv
  namespace: desuper-cl
spec:
  awsName: nginx-srv.desuper-cl.svc.cluster.local
  provider:
    virtualNode:
      virtualNodeRef:
        name: desuper-cl-ngnix-vn
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualService
metadata:
  name: php-srv
  namespace: desuper-cl
spec:
  awsName: php-srv.desuper-cl.svc.cluster.local
  provider:
    virtualRouter:
      virtualRouterRef:
        name: desuper-cl-php-vro
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualService
metadata:
  name: memcache-srv
  namespace: desuper-cl
spec:
  awsName: memcache-srv.desuper-cl.svc.cluster.local
  provider:
    virtualRouter:
      virtualRouterRef:
        name: desuper-cl-cache-vro 
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualService
metadata:
  name: sftp-srv
  namespace: desuper-cl
spec:
  awsName: srv-desuper-sftp.desuper-cl.svc.cluster.local
  provider:
    virtualRouter:
      virtualRouterRef:
        name: desuper-cl-sftp-vro            
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualRouter
metadata:
  namespace: desuper-cl
  name: desuper-cl-php-vro
spec:
  listeners:
    - portMapping:
        port: 9000
        protocol: tcp
  routes:
    - name: php-route
      priority: 1
      tcpRoute:
        match:
          prefix: /
        action:
          weightedTargets:
            - virtualNodeRef:
                name: desuper-cl-php-vn
              weight: 100
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualRouter
metadata:
  namespace: desuper-cl
  name: desuper-cl-cache-vro
spec:
  listeners:
    - portMapping:
        port: 11211
        protocol: tcp
  routes:
    - name: cache-route
      priority: 1
      tcpRoute:
        match:
          prefix: /
        action:
          weightedTargets:
            - virtualNodeRef:
                name: desuper-cl-memcache-vn
              weight: 100              
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualGateway
metadata:
  name: ingress-gw
  namespace: desuper-cl
spec:
  namespaceSelector:
    matchLabels:
      gateway: ingress-gw
  podSelector:
    matchLabels:
      app: ingress-gw
  listeners:
    - portMapping:
        port: 8080
        protocol: http
      # tls:
      #   mode: PERMISSIVE
      #   certificate:
      #     acm:
      #       certificateARN: arn:aws:acm:us-east-1:xxxx
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: GatewayRoute
metadata:
  name: desuper-cl-nginx-gr
  namespace: desuper-cl
spec:
  httpRoute:
    match:
      prefix: "/"
    action:
      target:
        virtualService:
          virtualServiceRef: 
            name: nginx-srv
---
apiVersion: v1
kind: Service
metadata:
  name: ingress-gw
  namespace: desuper-cl
  #annotations:
  #  service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: ClusterIP
  ports:
    - port: 8080
      targetPort: 8080
      name: http
  selector:
    app: ingress-gw
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ingress-gw
  namespace: desuper-cl
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ingress-gw
  template:
    metadata:
      labels:
        app: ingress-gw
    spec:
      serviceAccountName: desuper-cl-sc
      containers:
        - name: envoy
          image: 840364872350.dkr.ecr.us-east-1.amazonaws.com/aws-appmesh-envoy:v1.19.1.0-prod
          ports:
            - containerPort: 8080
          #env:
          #- name: APPMESH_RESOURCE_ARN
          #  value: arn:aws:appmesh:us-east-1:xxx

@msaustral I apologize for your experience so far in getting resolution on this issue. I'll follow up internally and see if I can help drive things.

Hi @jlbutler thank you for your time

Hi @jlbutler, we just hung up with support and finally made it.

For anybody having this issue: it was the application that was causing the redirect loop. This happened because the Envoy gateway by default changes the request hostname to the namespace's hostname once the traffic gets inside the namespace.

To track down the issue, we started with a plain index.html, then a PHP info page, and then the full app.

We managed to solve the issue using a new gateway route option that goes under the action:

apiVersion: appmesh.k8s.aws/v1beta2
kind: GatewayRoute
metadata:
  name: desuper-cl-nginx-gr
  namespace: desuper-cl
spec:
  httpRoute:
    match:
      prefix: "/"
    action:
      rewrite: #****THIS LINE****
        hostname: #****THIS LINE****
          defaultTargetHostname: DISABLED #****THIS LINE****
      target:
        virtualService:
          virtualServiceRef: 
            name: nginx-srv

Here is the documentation:

https://aws.github.io/aws-app-mesh-controller-for-k8s/reference/api_spec/#appmesh.k8s.aws/v1beta2.GatewayRouteHostnameRewrite

IMPORTANT: the documentation has an error; the value must be DISABLED, with a D at the end.

It was difficult and it took a while, but it was worth it. All that we have learned these months is priceless.

We are really thankful for all the help. I forgot to get the names of the AWS team (5) to thank them here. Great team!

Have a good weekend, all!!

Updated the documentation with the correct enum values:

(Appears on: GrpcGatewayRouteRewrite, HTTPGatewayRouteRewrite)

GatewayRouteHostnameRewrite refers to https://docs.aws.amazon.com/app-mesh/latest/APIReference/API_GatewayRouteHostnameRewrite.html Accepted values: ENABLED or DISABLED for default behavior of Hostname rewrite
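
For comparison, here is a sketch of the same gateway route with the other accepted value written out. It assumes, based on the description above, that ENABLED corresponds to the default behavior, in which the gateway rewrites the request hostname before forwarding. If that assumption holds, an application that redirects to its own canonical domain would be expected to show the redirect loop from this issue again. The resource names are reused from the manifests earlier in this thread and are illustrative only.

```yaml
# Sketch only: the GatewayRoute from this thread with the hostname
# rewrite explicitly left on. Assumed to behave the same as omitting
# the rewrite block entirely.
apiVersion: appmesh.k8s.aws/v1beta2
kind: GatewayRoute
metadata:
  name: desuper-cl-nginx-gr
  namespace: desuper-cl
spec:
  httpRoute:
    match:
      prefix: "/"
    action:
      rewrite:
        hostname:
          # ENABLED:  the gateway rewrites the request hostname (assumed default)
          # DISABLED: the hostname sent by the client is passed through unchanged
          defaultTargetHostname: ENABLED
      target:
        virtualService:
          virtualServiceRef:
            name: nginx-srv
```

Switching the value to DISABLED gives the manifest that resolved the issue above.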

Glad to see that new feature for Hostname came to use in a timely manner. Closing this issue now.