zilliztech / milvus-helm

couldn't connect to milvus k8s ingress via tls

hive74 opened this issue

Hello,
I'm running Milvus in k8s as a standalone deployment with TLS enabled. I have tls.crt and tls.key for my ingress DNS name and mounted them into the standalone pod via secretName: milvus-tls. The CA cert is also added to the standalone pod in /etc/ssl/certs. The certs are valid. Environment:

Python 3.10.12
protobuf 3.20.0
milvus-4.1.17
grpcio-tools 1.53.0
Milvus cli version: 0.4.2
Pymilvus version: 2.3.4

Milvus TLS config:

extraConfigFiles:
  user.yaml: |
    tls:
      serverPemPath: /tmp/tls.crt
      serverKeyPath: /tmp/tls.key
    common:
      security:
        tlsMode: 1

Ingress by default:

    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/listen-ports-ssl: '[19530]'
    nginx.ingress.kubernetes.io/proxy-body-size: 4m
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
rules:
  - host: k8s-milvus.example.com
    http:
      paths:
      - backend:
          service:
            name: my-release-milvus
            port:
              number: 19530
        path: /
        pathType: Prefix
tls:
  - hosts:
    - k8s-milvus.example.com
    secretName: milvus-tls

I get 502 in the browser, and Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED when trying to connect via a Python script like connections.connect("default", host="k8s-milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/ca.pem").

What I tried:

If I disable TLS on Milvus, drop the ingress annotation nginx.ingress.kubernetes.io/backend-protocol: GRPC, and keep TLS on the ingress, I get 404 in the browser (that's good) and CERTIFICATE_VERIFY_FAILED via the script.

If I connect via port 80 without milvus-tls, I get Handshake failed with fatal error SSL_ERROR_SSL: error:100000f7:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER.

I tried mixing params like server_pem_path, ca_pem_path, client_pem_path, etc.

Without milvus-tls in minikube, port-forwarding the standalone pod connects fine. Through the ingress it doesn't work even with a simple/default Milvus ingress. Maybe that's the main problem.

All pods are running without errors in the logs. How can I connect to Milvus via a Python script? How do I fix the SSL error? I can't disable TLS on the ingress, but I can disable it on Milvus if such a config (ingress TLS, Milvus no TLS) works.

If you use tlsMode 1 for Milvus, the value of the ingress annotation nginx.ingress.kubernetes.io/backend-protocol should be GRPCS.

See https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#backend-protocol for more information.

Or you can leave nginx.ingress.kubernetes.io/backend-protocol: GRPC and set tlsMode to 0 for Milvus.
In this way, the traffic from nginx to Milvus will be plaintext gRPC.
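
Roughly, the two consistent combinations in values.yaml look like this (just a sketch; the cert paths are the ones from your config above, adjust to your setup):

# Option A: end-to-end TLS (nginx re-encrypts gRPC to the Milvus backend)
ingress:
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: GRPCS
extraConfigFiles:
  user.yaml: |
    tls:
      serverPemPath: /tmp/tls.crt
      serverKeyPath: /tmp/tls.key
    common:
      security:
        tlsMode: 1

# Option B: TLS terminates at nginx, plaintext gRPC from nginx to Milvus
ingress:
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
extraConfigFiles:
  user.yaml: |
    common:
      security:
        tlsMode: 0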

Now I'm trying the no-TLS config:
values.yaml:

extraConfigFiles:
  user.yaml: |
   tls:
     serverPemPath: /milvus/configs/cert/server.pem
     serverKeyPath: /milvus/configs/cert/server.key
   common:
     security:
       tlsMode: 0

Ingress:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-release-milvus
  labels:
    helm.sh/chart: milvus-4.1.17
    app.kubernetes.io/name: milvus
    app.kubernetes.io/instance: my-release
    app.kubernetes.io/version: "2.3.8"
    app.kubernetes.io/managed-by: Helm
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: 4m
spec:
  defaultBackend:
    service:
      name: my-release-milvus
      port:
        number: 19530
  rules:
  - host: milvus.example.com
    http:
      paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: my-release-milvus
              port:
                number: 19530

I get 404 in the browser and:

curl http://milvus.example.com:80
404 page not found

But in VS Code, when running the script
connections.connect("default", uri="http://milvus.example.com:80", user='root', password='Milvus')
I get

Traceback (most recent call last):
  File "/home/volodinyk/milvus/createuser.py", line 20, in <module>
    connections.connect("default", uri="http://milvus.example.com:80", user='root', password='Milvus')
  File "/home/volodinyk/.local/lib/python3.10/site-packages/pymilvus/orm/connections.py", line 356, in connect
    connect_milvus(**kwargs, user=user, password=password, token=token, db_name=db_name)
  File "/home/volodinyk/.local/lib/python3.10/site-packages/pymilvus/orm/connections.py", line 302, in connect_milvus
    gh._wait_for_channel_ready(timeout=timeout)
  File "/home/volodinyk/.local/lib/python3.10/site-packages/pymilvus/client/grpc_handler.py", line 136, in _wait_for_channel_ready
    raise MilvusException(
pymilvus.exceptions.MilvusException: <MilvusException: (code=2, message=Fail connecting to server on milvus.example.com:80. Timeout)>

The only way the script works is when I port-forward from the service; then the connection is fine:
kubectl port-forward service/my-release-milvus -n milvus 27017:19530

The annotation nginx.ingress.kubernetes.io/backend-protocol: GRPC is necessary.

I added it to my no-TLS config (previous message); then I get 502 in the browser and with curl, and via the script as well:

import time
import numpy as np
from pymilvus import (
    connections, db,
    utility,
    FieldSchema, CollectionSchema, DataType,
    Collection, Role
)

fmt = "\n=== {:30} ===\n"
search_latency_fmt = "search latency = {:.4f}s"
num_entities, dim = 3000, 8

print(fmt.format("start connecting to Milvus"))
connections.connect("default", uri="http://milvus.example.com:80", user='root', password='Milvus')
and I get (code=2, message=Fail connecting to server on milvus.example.com:80. Timeout)

@hive74 it looks like frontend TLS is necessary for nginx ingress to proxy gRPC.

Update your ingress like below:

annotations:
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/proxy-body-size: 4m
rules:
  - host: k8s-milvus.example.com
    http:
      paths:
      - backend:
          service:
            name: my-release-milvus
            port:
              number: 19530
        path: /
        pathType: ImplementationSpecific
tls:
  - hosts:
    - k8s-milvus.example.com
    secretName: milvus-tls

But keep the backend Milvus tlsMode=0.

Then connect with
connections.connect("default", uri="https://k8s-milvus.example.com:443", user='root', password='Milvus', secure=True)

Thanks for the reply, but I again get 502 in the browser, and via the script Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED and <MilvusException: (code=2, message=Fail connecting to server on milvus.example.com:443. Timeout)>.

What I did in local minikube:

  • generate certs via ./gen.sh with CommonName="milvus.example.com", other params by default
  • kubectl create secret tls milvus-tls -n milvus --key="/home/testuser/milvus/cert/server.key" --cert="/home/testuser/milvus/cert/server.pem"
  • values.yaml
extraConfigFiles:
  user.yaml: |
   tls:
     serverPemPath: /milvus/configs/cert/server.pem
     serverKeyPath: /milvus/configs/cert/server.key
   common:
     security:
       tlsMode: 0
  • ingress like your example
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-release-milvus
  labels:
    helm.sh/chart: milvus-4.1.17
    app.kubernetes.io/name: milvus
    app.kubernetes.io/instance: my-release
    app.kubernetes.io/version: "2.3.8"
    app.kubernetes.io/managed-by: Helm
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/proxy-body-size: 4m
spec:
  rules:
  - host: milvus.example.com
    http:
      paths:
        - path: /
          pathType: ImplementationSpecific
          backend:
            service:
              name: my-release-milvus
              port:
                number: 19530
  tls:
  - hosts:
    - milvus.example.com
    secretName: milvus-tls
  • deploy standalone
    helm upgrade --cleanup-on-fail --install my-release milvus/milvus --set cluster.enabled=false --values values.yaml --set etcd.replicaCount=1 --set minio.mode=standalone --set pulsar.enabled=false -n milvus
  • run script
    connections.connect("default", uri="https://milvus.example.com:443", user='root', password='Milvus', secure=True)

I did it also in a full k8s cluster with existing valid certs, and the result is the same (502 in the browser, cert error via the script).

Where is my mistake? Is it OK to get 502 in the browser?

Hmm, I'm going to need more information to be sure what's going on. Could you try the command curl -v https://milvus.example.com:443 and paste the output here?

Oh, I know what's going on.

You're using a private certificate signed by yourself, so you need to pass the server.pem when connecting.

Try this:
connections.connect("default", host="k8s-milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/cert/server.pem")

Hmm, I'm going to need more information to be sure what's going on. Could you try the command curl -v https://milvus.example.com:443 and paste the output here?

Different tries:
https://pastebin.com/XH8ab2fX

Oh, I know what's going on.

You're using a private certificate signed by yourself, so you need to pass the server.pem when connecting.

Try this: connections.connect("default", host="k8s-milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/cert/server.pem")

connections.connect("default", host="milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/cert/server.pem")

Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED.
(code=2, message=Fail connecting to server on milvus.example.com:443. Timeout)

I want to note that the ingress TLS certs are generated by gen.sh and I try to connect with those *.pem files. But user.yaml is using the default certs generated by Milvus before start, with tlsMode: 0; can that have an effect? Maybe different certs are needed?

extraConfigFiles:
  user.yaml: |
   tls:
     serverPemPath: /milvus/configs/cert/server.pem
     serverKeyPath: /milvus/configs/cert/server.key
   common:
     security:
       tlsMode: 0

I want to note that the ingress TLS certs are generated by gen.sh and I try to connect with those *.pem files. But user.yaml is using the default certs generated by Milvus before start, with tlsMode: 0; can that have an effect? Maybe different certs are needed?

You don't need to enable TLS on Milvus. TLS terminates at the nginx ingress; nginx communicates with the backend Milvus using plaintext gRPC. So no worries about this.

Alter it a bit:

connections.connect("default", host="milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/ca.pem", server_name="milvus.example.com")

@XuanYang-cn These configurations are indeed very confusing... For now, server_pem_path is not working without server_name. This seems to be a bug.

Alter it a bit:

connections.connect("default", host="milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/ca.pem", server_name="milvus.example.com")

I run
connections.connect("default", host="milvus.example.com", port="443", secure=True, server_pem_path="/home/testuser/milvus/ca.pem", server_name="milvus.example.com")

and get Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED.

I get these certs after running:

chmod +x gen.sh
./gen.sh
/milvus/cert$ ls 
ca.key  ca.pem  client.csr  client.key  client.pem  gen.sh  openssl.cnf  server.csr  server.key  server.pem

I keep server.pem and server.key as the TLS secret and put them into the ingress secret:
kubectl create secret tls milvus-tls -n milvus --key="/home/testuser/milvus/cert/server.key" --cert="/home/testuser/milvus/cert/server.pem"

gen.sh params:

Country="CN"
State="Shanghai"
Location="Shanghai"
Organization="milvus"
Organizational="milvus"
CommonName="milvus.example.com"
and in the ingress:

  tls:
  - hosts:
    - milvus.example.com
    secretName: milvus-tls

I GOT IT!!
Trying on the k8s cluster (not minikube) with the CA cert I have in my Ubuntu store at /etc/ssl/certs/ca-certificates.crt:
connections.connect("default", host="k8s-milvus.example.com", port="443", secure=True, server_pem_path="/etc/ssl/certs/ca-certificates.crt", server_name="k8s-milvus.example.com")
The connection works!

The problem was the missing server_name="k8s-milvus.example.com".

I'll research it in minikube; maybe it's a problem with the CA. The minikube cert config is above.

Thank you so much @haorenfsa

Oh, good catch! The server_name thing is indeed a bug to me. We'll fix it soon. Happy hacking with Milvus!

So an extra guide for minikube should be added. Most people don't have a real k8s cluster to play with.

Fixed in minikube by this:
minikube was serving the Kubernetes Ingress Controller Fake Certificate, and I needed to customize it.
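
One way to replace that fake certificate (a sketch only, assuming the ingress-nginx controller is installed via its helm chart and the milvus-tls secret exists in the milvus namespace; adjust names to your setup) is to give the controller a default SSL certificate:

controller:
  extraArgs:
    # served when no matching TLS secret/host is found, instead of the
    # built-in "Kubernetes Ingress Controller Fake Certificate"
    default-ssl-certificate: "milvus/milvus-tls"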

I’m getting another error when trying to connect to Milvus when using nginx ingress on minikube to handle the TLS. I’m trying to create an (HTTP/gRPC) proxy from nginx to the Milvus service. I have a valid TLS certificate to test with as a secret: cert.

helm install my-release milvus/milvus -f values.yml

values.yml:

cluster:
  enabled: false

standalone:
  persistence:
    mountPath: "/var/lib/milvus"
    enabled: true
    persistentVolumeClaim:
      accessModes: ReadWriteOnce
      size: 10Gi
      subPath: ""

minio:
  enabled: true

etcd:
  enabled: true
  name: etcd
  replicaCount: 1
  pdb:
    create: false

  service:
    type: ClusterIP
    port: 2379
    peerPort: 2380

ingress:
  enabled: true
  annotations:
    # Annotation example: set nginx ingress type
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/proxy-body-size: 4m
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  labels: {}
  rules:
    - host: "milvus.mysite.be"
      path: "/"
      pathType: "Prefix"
  tls: 
    - secretName: cert
      hosts:
        - milvus.mysite.be

extraConfigFiles:
  user.yaml: |+
    tls:
      serverPemPath: /milvus/configs/cert/server.pem
      serverKeyPath: /milvus/configs/cert/server.key
    common:
      security:
        tlsMode: 0
    
auth:
  rbac:
    enabled: false

persistence:
  enabled: true
  storageClass: standard
  accessMode: ReadWriteOnce
  size: 10Gi

pulsar:
  enabled: false

When I try to connect via Python:

from pymilvus import connections, db, utility

connections.connect("default", host="milvus.mysite.be", port="443", secure=True, server_pem_path="/path/to/root-ca.pem", server_name="milvus.mysite.be")

print(db.list_database(using="default"))

print(utility.list_collections(timeout=None, using='default'))

I get:

E0325 11:26:37.585249639 133238 hpack_parser.cc:993] Error parsing 'content-type' metadata: invalid value

Hi @indyvanmol, please check whether milvus.mysite.be correctly resolves to the IP of the nginx ingress. And try curl --http2 https://milvus.mysite.be/ to see if there are any clues in the output.

@haorenfsa

curl --http2 https://milvus.mysite.be

returns: 404 page not found

@indyvanmol that means the ingress is not created correctly. What's the output of kubectl describe ingress?

kubectl get ingress my-release-milvus -o yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    meta.helm.sh/release-name: my-release
    meta.helm.sh/release-namespace: default
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/listen-ports-ssl: '[19530]'
    nginx.ingress.kubernetes.io/proxy-body-size: 4m
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  creationTimestamp: "2024-03-19T12:02:07Z"
  generation: 1
  labels:
    app.kubernetes.io/instance: my-release
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: milvus
    app.kubernetes.io/version: 2.3.9
    helm.sh/chart: milvus-4.1.18
  name: my-release-milvus
  namespace: default
  resourceVersion: "346877"
  uid: dc8a3675-62c5-4a6b-a5fe-540b8b58507c
spec:
  defaultBackend:
    service:
      name: my-release-milvus
      port:
        number: 19530
  rules:
  - host: milvus.mysite.be
    http:
      paths:
      - backend:
          service:
            name: my-release-milvus
            port:
              number: 19530
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - milvus.mysite.be
    secretName: cert
status:
  loadBalancer: {}

No loadBalancer address is attached, which means your nginx-ingress-controller is not set up correctly. Please refer to this doc for the setup procedure: https://milvus.io/docs/ingress.md

The docs you’re showing are for an Azure setup. I have an on-premise setup and my Nginx ingress controller is working for other HTTP services where I’m proxying to.

What kind of protocol is used for Milvus? Is it HTTP, and does an HTTP proxy to Milvus work? Are the examples shown with Nginx just TCP forwarding and not HTTP? So, with the Nginx ingress examples, is TLS encryption done at the TCP level, with the proxy using TCP and not HTTP? By proxy I mean the channel from the nginx ingress to the service.

The docs you’re showing are for an Azure setup.

@indyvanmol It's a mistake to put https://milvus.io/docs/ingress.md under the Azure section; it can be used anywhere. We'll update the doc soon.

I have an on-premise setup and my Nginx ingress controller is working for other HTTP services where I’m proxying to.

We would need the output of kubectl describe ingress to see the events explaining why the loadBalancer IP is not assigned. The kubectl get command doesn't provide the events information we need.

What kind of protocol is used for Milvus? Is it HTTP, and does an HTTP proxy to Milvus work?

Milvus uses gRPC, and gRPC is built on top of HTTP/2. Any HTTP proxy that supports HTTP/2 would work for Milvus.

Are the examples shown with Nginx just TCP forwarding and not HTTP?

Yes, in https://milvus.io/docs/azure.md.
You only need to set service.type in values.yaml:

service:
  type: LoadBalancer

So, with the Nginx ingress examples, is TLS encryption done at the TCP level, with the proxy using TCP and not HTTP? By proxy I mean the channel from the nginx ingress to the service.

No. If you use an ingress, the TLS encryption is done at the nginx proxy (the HTTP layer). The channel from the nginx ingress to the Milvus service is plaintext gRPC (i.e. HTTP/2 over raw TCP, not TLS). Usually HTTP/2 is used together with TLS, but this is a special case.

If you use a service, there's no TLS (unless you add specific annotations to the service and provide TLS certificates and keys). The client communicates with Milvus in plaintext gRPC, the same way the nginx proxy communicates with the backend in the ingress case.

Thank you very much for the feedback. I'm sorry for all the trouble caused by the docs; they should be better organized. I'll see to it that this gets done.

@haorenfsa thanks for helping me. I hope this gives you some more insight into how to make the docs better.

kubectl describe ingress my-release-milvus

Name:             my-release-milvus
Labels:           app.kubernetes.io/instance=my-release
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=milvus
                  app.kubernetes.io/version=2.3.9
                  helm.sh/chart=milvus-4.1.18
Namespace:        default
Address:          
Ingress Class:    <none>
Default backend:  my-release-milvus:19530 (10.244.0.63:19530)
TLS:
  cert terminates milvus.mysite.be
Rules:
  Host                 Path  Backends
  ----                 ----  --------
  milvus.mysite.be  
                       /   my-release-milvus:19530 (10.244.0.63:19530)
Annotations:           kubernetes.io/ingress.class: nginx
                       meta.helm.sh/release-name: my-release
                       meta.helm.sh/release-namespace: default
                       nginx.ingress.kubernetes.io/backend-protocol: GRPC
                       nginx.ingress.kubernetes.io/listen-ports-ssl: [19530]
                       nginx.ingress.kubernetes.io/proxy-body-size: 4m
                       nginx.ingress.kubernetes.io/ssl-redirect: true
Events:                <none>

@indyvanmol I'm quite sure your nginx ingress was not installed or configured correctly. Please try following this doc's installation instructions: https://milvus.io/docs/ingress.md
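
One thing worth double-checking (an assumption on my side based on the describe output above showing Ingress Class: <none>, not something confirmed here): recent ingress-nginx releases select Ingresses through an IngressClass rather than the deprecated kubernetes.io/ingress.class annotation, so the Ingress spec may need something like:

spec:
  ingressClassName: nginx  # must match the class reported by kubectl get ingressclass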

@haorenfsa I installed Nginx as described in the documentation and it indeed works, for which I am thankful. However, I suggest that a section be added to the documentation detailing the specific configuration required for Nginx. This is because I installed Nginx using the manifest files, not via Helm, so it’s primarily about configuration. However, it’s not clear to me what specific configuration is needed. Regardless, I appreciate the help and am pleased that it works. I find it interesting to know what kind of settings are expected of a proxy globally.

@indyvanmol We'll add a section about nginx later. Thank you very much for the suggestion!