istio / test-infra

Setup k8s cluster for aarch64 for PROW

morlay opened this issue

We could switch the docker buildx builder driver from docker-container to kubernetes.

Once we add an aarch64 node (at least 12 CPUs) to the Prow k8s cluster, we could update some prow.yaml files.

An example for containers-test_tools: https://prow.istio.io/prowjob?prowjob=8db095b7-dac8-11eb-ac46-e64a9a7495cb

The first important step, for all jobs, is to add a nodeSelector to make sure existing jobs keep running on amd64 nodes:

nodeSelector:
  testing: test-pool
  "kubernetes.io/arch": "amd64"

Then we could create the container-builder:

export BUILDKIT_NAMESPACE=buildkit
export BUILDKIT_IMAGE=gcr.io/istio-testing/buildkit:v0.8.3
export DRIVER_COMMON_OPTS=namespace=${BUILDKIT_NAMESPACE},image=${BUILDKIT_IMAGE}
export BUILDKIT_FLAGS=--debug

docker buildx create --name=container-builder --platform=linux/amd64 --node=container-builder-amd64 \
  --driver=kubernetes --driver-opt=${DRIVER_COMMON_OPTS},nodeselector="kubernetes.io/arch=amd64" \
  --buildkitd-flags="${BUILDKIT_FLAGS}"
docker buildx create --append --name=container-builder --platform=linux/arm64 --node=container-builder-arm64 \
  --driver=kubernetes --driver-opt=${DRIVER_COMMON_OPTS},nodeselector="kubernetes.io/arch=arm64" \
  --buildkitd-flags="${BUILDKIT_FLAGS}"

The service account for the job should have full access to Deployments and Pods in the buildkit namespace (or another namespace).
(Could the mounted service account be granted those permissions? @howardjohn)
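
A minimal sketch of the required RBAC, assuming a hypothetical service account prowjob-builder in a test-pods namespace (both names are placeholders, not the real Prow ones):

# grant full access to Deployments and Pods in the buildkit namespace;
# the buildx kubernetes driver likely also needs pods/exec to tunnel to buildkitd
kubectl create role buildkit-builder --namespace=buildkit \
  --verb=get,list,watch,create,update,patch,delete \
  --resource=deployments,pods,pods/exec
kubectl create rolebinding buildkit-builder --namespace=buildkit \
  --role=buildkit-builder \
  --serviceaccount=test-pods:prowjob-builder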

If it is not easy to add an aarch64 host on Google Cloud, we could set up the multi-arch container-builder in another environment, then link the remote one via KUBECONFIG:

export KUBECONFIG=./path/to/kubeconfig.for.remote

Then we could have a multi-arch builder:

$ docker buildx inspect container-builder --bootstrap
[+] Building 15.3s (2/2) FINISHED                                                                                                                                           
 => [container-builder-amd64 internal] booting buildkit                                                                                                                9.6s
 => => waiting for 1 pods to be ready                                                                                                                                  9.5s
 => [container-builder-arm64 internal] booting buildkit                                                                                                               15.2s
 => => waiting for 1 pods to be ready                                                                                                                                 15.1s
Name:   container-builder
Driver: kubernetes

Nodes:
Name:      container-builder-amd64
Endpoint:  kubernetes://container-builder?deployment=container-builder-amd64
Status:    running
Flags:     --debug
Platforms: linux/amd64*, linux/386

Name:      container-builder-arm64
Endpoint:  kubernetes://container-builder?deployment=container-builder-arm64
Status:    running
Flags:     --debug
Platforms: linux/arm64*
$ kubectl get pods --namespace=buildkit --output=custom-columns=NAME:.metadata.name,STATUS:.status.phase,NODE_SELECTOR:.spec.nodeSelector
NAME                                       STATUS    NODE_SELECTOR
container-builder-amd64-fcb75ccb9-c9rms    Running   map[kubernetes.io/arch:amd64]
container-builder-arm64-6c87b866cc-z5mqm   Running   map[kubernetes.io/arch:arm64]

When we use docker buildx build --platform=linux/amd64,linux/arm64,
we can build for all architectures at the same time.
This makes it easy to fit into the current CI workflow.
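
For example, a sketch of such a build (the image name is a placeholder):

# build for both platforms on the builder created above and push one multi-arch tag
docker buildx build \
  --builder=container-builder \
  --platform=linux/amd64,linux/arm64 \
  --tag=gcr.io/istio-testing/example:latest \
  --push .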

Is it possible that instead of creating one prowjob which spins up a pod to build arm64 and amd64 images, we instead create two prowjobs, one for arm64 and one for amd64, each building locally? Then we join the images using the docker manifest command or equivalent.
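
For instance, a sketch of the join step, assuming each job pushed an arch-suffixed tag (the names are hypothetical):

# stitch the per-arch images into one multi-arch manifest list
docker manifest create gcr.io/istio-testing/example:latest \
  gcr.io/istio-testing/example:latest-amd64 \
  gcr.io/istio-testing/example:latest-arm64
docker manifest push gcr.io/istio-testing/example:latest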

I think that would facilitate testing better as well, since then we can build and test on the same pod. It's also a lot more secure (no need for broad pod-create permissions) and more portable (no dependency on Kubernetes - today we just use Kubernetes as a dumb pod scheduler; the jobs could easily be run in any other environment).

@howardjohn

Yes, it could be, but projects like tools and proxy may need a lot of changes to adopt the new workflow.

@howardjohn could you help @AWSjswinney set up an env for arm64 builds?

I understand your points, so let's do it that way.

I think each project is different:

  • istio/proxy: we don't use docker at all, it's all Bazel. We pretty much need different machines here, I think.
  • istio/istio: we use docker, but it's just copying files in. We don't actually do any operations inside docker. QEMU is likely sufficient here (TODO: prove this) since it's just copies; see the sketch after this list.
  • istio/tools: the only case where we do a "traditional" docker build, where this would make sense. For this case it's probably simple enough to do what is described above.
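
As a hedged sketch of the QEMU route: registering binfmt handlers on an amd64 node lets buildx run arm64 stages under emulation (tonistiigi/binfmt is the helper image buildx itself uses):

# register qemu emulation for arm64 binaries (needs a privileged container)
docker run --privileged --rm tonistiigi/binfmt --install arm64
# after this, an arm64 build can run on the amd64 host
docker buildx build --platform=linux/arm64 .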

Yes. We could just set up a cluster for arm64 only.

Once PR istio/istio#33763 merges (hopefully before release-1.11),
we can easily build arm64-only or multi-arch istio images from the original sources without custom changes (we should ensure iptables uses a multi-arch sha256 tag too).
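
As a quick check (a sketch; the image reference below is a placeholder), docker buildx imagetools inspect shows whether a tag resolves to a multi-arch manifest list:

# the output lists the platforms behind the tag, e.g. linux/amd64 and linux/arm64
docker buildx imagetools inspect example.io/istio/iptables:latest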

I started to set up a CI in GitHub Actions with a self-hosted arm64 runner:
https://github.com/querycap/istio/blob/build-origin/Makefile

  • istio/tools requires an aarch64 host for building the build-tools-proxy image for arm64,
    • because of the many compile jobs involved; with QEMU it would be very, very slow.
  • istio/proxy requires an aarch64 host for compiling the envoy binary for arm64.
  • istio/istio can work well with QEMU; Go binaries can be built on either an x86_64 or aarch64 host (see the sketch below).
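
For example (a sketch; the package path is illustrative), Go cross-compiles for arm64 from any host:

# build a linux/arm64 binary on an amd64 machine, no emulation needed
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -o out/linux_arm64/pilot-discovery ./pilot/cmd/pilot-discovery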

When the arm64 cluster is ready, istio can officially release multi-arch images.

The 'red' steps are the ones we still need to do.

graph LR
    classDef todo fill:#ffb3b3,stroke:#000;
    classDef repo fill:#ffb808,stroke:#000;

    object-storage[(object storage)]

    subgraph prow-arm64
        image-build-tools-proxy-arm64(build-tools-proxy:arm64):::todo
        -.->build-envoy-arm64((build envoy arm64)):::todo
    end

    subgraph prow-amd64
        image-build-tools-proxy-amd64(build-tools-proxy:amd64)
        -.->build-envoy-amd64((build envoy amd64))

        image-build-tools-amd64(build-tools:amd64)
        -.->build-istio-images((build istio images))
        -->istio-images(istio/*:*)
    end

    istio-build-tools:::repo
    -->|clone & build image| image-build-tools-amd64 & image-build-tools-proxy-amd64 & image-build-tools-proxy-arm64

    istio-proxy:::repo
    -->|clone| build-envoy-amd64 & build-envoy-arm64 
    
    build-envoy-amd64 --> |envoy-amd64| object-storage
    build-envoy-arm64 --> |envoy-arm64| object-storage
    object-storage -->|download envoy-*| build-istio-images

    istio:::repo
    -->|clone| build-istio-images

This is done now. Thanks everyone!