kubewharf / kelemetry

Global control plane tracing for Kubernetes

how to aggregate multicluster event by kelemetry

zebhuang opened this issue · comments

In k8s/config/mapoption there are two fields, targetClusterName and kubeconfig, which seem to mean that Kelemetry can watch multi-cluster events from one master (control-plane) cluster. But I found that the informer only watches the target cluster; the other clusters seem to be used only for the diff API. So I want to know how to aggregate multi-cluster events with Kelemetry. I have some guesses, please correct me if I am wrong:

  1. Deploy Kelemetry per cluster in in-cluster mode.
  2. Deploy Kelemetry per cluster in out-of-cluster mode in the control-plane cluster.

Each Kelemetry controller instance can only watch events from one cluster, namely its "target cluster". The recommended approach is:

  • Deploy the central diff cache + span cache databases.
  • Deploy central webhook + consumer instances and connect them to the central cache databases.
  • Deploy in-cluster informer instances and connect them to the central cache databases.
```mermaid
graph TB
  subgraph central deployment
    consumer[webhook + consumer]
    cache[diff & span cache]
    consumer --> cache
  end
  subgraph cluster1
    apiserver1[kube-apiserver]
    informers1[informers]
    apiserver1 --> |list-watch| informers1
  end
  apiserver2 -->|audit\nwebhook| consumer
  subgraph cluster2
    apiserver2[kube-apiserver]
    informers2[informers]
    apiserver2 --> |list-watch| informers2
  end
  apiserver1 -->|audit\nwebhook| consumer
  informers1 --> cache
  informers2 --> cache
```

But due to multi-cluster linking (e.g. if you use the annotation linker or the LinkRule linker in #109), informers and consumers also need out-of-cluster GET access to all other apiservers. That is, if an informer/consumer processes audits/events from cluster1 but the processed object references a parent link in cluster2, then the informer/consumer also needs to know how to access the parent object in order to generate the transitive parents. (This is omitted in the diagram since the full connection graph would be too messy.)
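For illustration, a cross-cluster parent reference could look something like the annotation below on an object in cluster1 pointing at an owner in cluster2. The annotation key and JSON schema shown here are assumptions for the sake of the example, not the actual Kelemetry format — check the linker documentation for the real one:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: example
  namespace: default
  annotations:
    # Hypothetical parent-link annotation (key and schema assumed).
    # The informer/consumer handling this object in cluster1 must be able
    # to GET the referenced object in cluster2 to build transitive parents,
    # hence the out-of-cluster access requirement described above.
    kelemetry.kubewharf.io/parent-link: |
      {"cluster": "cluster2", "group": "apps", "resource": "deployments",
       "namespace": "default", "name": "owner-deployment"}
```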

The current Helm chart does not explicitly support multi-cluster deployment because cluster management techniques vary, but the rough idea is:

  1. Deploy an etcd cluster and an ElasticSearch cluster separately, without using the Kelemetry Helm chart.
  2. Edit values.yaml:
  • Update multiCluster.clusters to include connection details for all clusters.
  • Change externalEndpoint: false under diffCache and spanCache to the URL of the etcd cluster deployed in step 1. (Optionally, also change it for traceCache to share search results between frontend instances.)
  • Change storageBackend.type to elasticsearch and update storageBackend.options to point to the ElasticSearch cluster deployed in step 1.
  3. For each cluster, update multiCluster.currentClusterName to that cluster's name and deploy the chart for that cluster.
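The steps above might translate into a values.yaml roughly like the sketch below. Only the field names mentioned above (multiCluster.clusters, multiCluster.currentClusterName, diffCache/spanCache externalEndpoint, storageBackend.type/options) come from this thread; the layout of the cluster entries and the option keys are assumptions — consult the chart's default values.yaml for the exact schema:

```yaml
# Sketch of a multi-cluster values.yaml; entry layout is illustrative only.
multiCluster:
  # Set per cluster before deploying the chart there (step 3).
  currentClusterName: cluster1
  clusters:
    - name: cluster1
      kubeconfig: /etc/kelemetry/kubeconfigs/cluster1  # assumed key name
    - name: cluster2
      kubeconfig: /etc/kelemetry/kubeconfigs/cluster2  # assumed key name

diffCache:
  # Replaces the default `false`; points at the etcd cluster from step 1.
  externalEndpoint: http://etcd.central.example.com:2379
spanCache:
  externalEndpoint: http://etcd.central.example.com:2379
# Optionally share frontend search results between instances too:
# traceCache:
#   externalEndpoint: http://etcd.central.example.com:2379

storageBackend:
  type: elasticsearch
  options:
    # Assumed option key; points at the ElasticSearch cluster from step 1.
    address: http://elasticsearch.central.example.com:9200
```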

Nevertheless, if you expect no inter-cluster parent references, you can skip all these steps and just switch the storageBackend to a common storage backend.