openshift / ci-operator

Failed to push images to private registry on OCP 3.11

shlao opened this issue · comments

Hi, I am trying to set up an environment for ci-operator to watch Installer.
I get the following errors when pushing images to the OCP 3.11 private registry:
*** The image of the release container is: registry.svc.ci.openshift.org/openshift/origin-v4.0@sha256:08a5da4ce4fa7b95393c15ff7ab0ea1a97edf5aa2f9bef6150e60370fd641555

a) ci-operator has not been tested with a Docker registry that uses a self-signed certificate, and I get this error:
error: unable to check your credentials - pass --skip-check to bypass this error: Get https://docker-registry.default.svc:5000/v2/: x509: certificate signed by unknown authority

b) Errors from the oc adm release command, in container 'registry.svc.ci.openshift.org/openshift/origin-v4.0':
I) error: only image streams with public image repositories can be the source for releases when using the default --reference-mode
II) error: --to-image was not valid: invalid reference format

c) Registry access problem:
denied: requested access to the resource is denied

For problem a), I updated the CA with these commands:
$ docker cp /etc/origin/master/ca-bundle.crt f4916a7c15f5:/etc/pki/ca-trust/source/anchors/openshift-ca.crt
[root@f4916a7c15f5 ~]# update-ca-trust enable

But I don't know how to fix problems b) and c).
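
To confirm whether the CA fix for problem a) took effect, the registry endpoint from the error message can be probed directly. The following is a minimal sketch, not part of ci-operator or oc: `check_registry_tls` is a hypothetical helper name, and it assumes `curl` is available wherever the check runs.

```shell
#!/bin/sh
# Hypothetical helper: check that a registry's TLS certificate verifies
# against a given CA bundle.
# Usage: check_registry_tls CA_BUNDLE HOST:PORT
check_registry_tls() {
    ca=$1
    registry=$2
    if [ ! -r "$ca" ]; then
        echo "CA bundle not readable: $ca" >&2
        return 2
    fi
    # Without --fail, curl exits 0 even on HTTP 401, so a non-zero status
    # here points to a connection or TLS problem, not missing credentials.
    curl --silent --show-error --output /dev/null \
        --cacert "$ca" "https://$registry/v2/"
}

# Example, using the paths from this report:
# check_registry_tls /etc/origin/master/ca-bundle.crt docker-registry.default.svc:5000
```

Note that the check has to pass from inside the container that runs `oc registry login`, since that is where the x509 error is raised.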

Hi, thanks for the report! To be able to help you out, we'll need more information from you:

  1. What cluster are you trying to run ci-operator against? I understand it's some custom OCP 3.11 cluster?
  2. Please provide exact inputs/outputs you are using. Ideally, give us the full ci-operator command you are trying to run (with all parameters), the ci-operator config file content you use and any relevant environmental variable content (like JOB_SPEC).

More information about what you are trying to achieve would be helpful as well. Could you elaborate more on "setup the environment for ci-operator to watch installer"?

@petr-muller Thanks

  1. OCP 3.11 is installed on two nodes: one master, one infra. I used the Ansible playbooks to set up the cluster and followed the guide to configure the private registry (with SSL).

  2. I don't set any environment variables, like JOB_SPEC. I start ci-operator (on the master node), as follows:
    ci-operator --v --git-ref=shlao/installer@master
    --config /root/go/src/github.com/openshift/release/ci-operator/config/openshift/installer/openshift-installer-master.yaml
    --template /root/go/src/github.com/openshift/release/ci-operator/templates/openshift/installer/cluster-launch-installer-e2e.yaml

  3. Here is the excerpt.

$ oc get pod release-latest -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    ci-operator.openshift.io/container-sub-tests: release
    ci.openshift.io/job-spec: ""
    openshift.io/scc: anyuid
  creationTimestamp: 2019-02-02T02:41:09Z
  labels:
    build-id: ""
    created-by-ci: "true"
    job: dev
    persists-between-builds: "false"
    prow.k8s.io/id: ""
  name: release-latest
  namespace: ci-op-7ki07xfl
  ownerReferences:
  - apiVersion: image.openshift.io/v1
    controller: true
    kind: ImageStream
    name: pipeline
    uid: 70973860-2693-11e9-895c-fa163e380b62
  resourceVersion: "6231"
  selfLink: /api/v1/namespaces/ci-op-7ki07xfl/pods/release-latest
  uid: ff695cbf-2693-11e9-895c-fa163e380b62
spec:
  containers:
  - command:
    - /bin/sh
    - -c
    - |
      #!/bin/sh
      set -eu

      set -euo pipefail
      export HOME=/tmp
      oc registry login
      oc adm release new --max-per-registry=32 -n "ci-op-7ki07xfl" --from-image-stream "stable" --to-image-base "docker-registry.default.svc:5000/ci-op-7ki07xfl/stable@sha256:34abb8345b741ad2da356f7bf380cefe511f1d1380df9bc7d77a8e01b51fadd7" --to-image ":latest"
      oc adm release extract --from=":latest" --to=/tmp/artifacts/release-payload
    image: registry.svc.ci.openshift.org/openshift/origin-v4.0@sha256:08a5da4ce4fa7b95393c15ff7ab0ea1a97edf5aa2f9bef6150e60370fd641555
    imagePullPolicy: IfNotPresent
    name: release
    securityContext:
      capabilities:
        drop:
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: builder-token-85z26
      readOnly: true
  imagePullSecrets:
  - name: builder-dockercfg-np72j
  nodeName: qe-shlao-cimerrn-1
  nodeSelector:
    node-role.kubernetes.io/compute: "true"
  securityContext:
    seLinuxOptions:
      level: s0:c17,c9
  serviceAccount: builder
  serviceAccountName: builder
  terminationGracePeriodSeconds: 30
  tolerations:
  - name: builder-token-85z26
    secret:
      defaultMode: 420
      secretName: builder-token-85z26
status:
  containerStatuses:
  - containerID: docker://6309ccf407f1076abf78a59eb1e3ee4bf496e0fb514375634d6948f99e2f0d09
    image: registry.svc.ci.openshift.org/openshift/origin-v4.0@sha256:08a5da4ce4fa7b95393c15ff7ab0ea1a97edf5aa2f9bef6150e60370fd641555
    imageID: docker-pullable://registry.svc.ci.openshift.org/openshift/origin-v4.0@sha256:08a5da4ce4fa7b95393c15ff7ab0ea1a97edf5aa2f9bef6150e60370fd641555
    lastState: {}
    name: release
    ready: false
    restartCount: 0
    state:
      terminated:
        containerID: docker://6309ccf407f1076abf78a59eb1e3ee4bf496e0fb514375634d6948f99e2f0d09
        exitCode: 1
        finishedAt: 2019-02-02T02:41:16Z
        message: |
          info: Using internal registry hostname docker-registry.default.svc:5000
          error: unable to check your credentials - pass --skip-check to bypass this error: Get https://docker-registry.default.svc:5000/v2/: x509: certificate signed by unknown authority
        reason: Error
  4. Here is the command that raises the errors:
    oc adm release new --max-per-registry=32 -n "ci-op-7ki07xfl" --from-image-stream "stable" --to-image-base "docker-registry.default.svc:5000/ci-op-7ki07xfl
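
The pod spec above passes `--to-image ":latest"`, a reference with a tag but no repository name, which looks like the trigger for error b) II ("invalid reference format"). The sketch below is a rough pre-flight check for that one case only. `has_repository_name` is a hypothetical helper, not an oc feature, and the `release` image stream name in the second sample is an assumption made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reject image references with an empty repository name.
# This catches only references like ":latest" or "@sha256:...", not every
# invalid reference.
has_repository_name() {
    ref=$1
    case "$ref" in
        ''|:*|@*) return 1 ;;   # empty, or starts with a tag/digest separator
        *)        return 0 ;;
    esac
}

# The first value is taken from the pod spec; the second is an assumed example.
for ref in ":latest" "docker-registry.default.svc:5000/ci-op-7ki07xfl/release:latest"; do
    if has_repository_name "$ref"; then
        echo "ok: $ref"
    else
        echo "missing repository name: $ref"
    fi
done
```

An empty prefix would be consistent with error b) I, which says the image stream has no public image repository to build the reference from, but that link is a guess from the output shown here.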

It seems that the release pod doesn't have the CA bundle of the private registry, and that the secret builder-token-85z26 doesn't have the privilege to push images.

Where can I set the secret and how?

Thanks

The output log is similar to logs

But it fails at the release step.

You can use triple backticks to pass log or code blocks properly formatted. Please provide the output of the ci-operator command (the one you give in bullet 2), ideally after you fix the SSL setup (I don't think we'll want to deal with running ci-operator against misconfigured registries, such as ones with a self-signed certificate).

Anyway, executing ci-operator against a custom cluster is quite uncommon. What are you trying to achieve? If you want to experiment with executing it, or with developing a test template, we provide a staging namespace on api.ci for that.

@petr-muller The log is too long, so I attached it (it uses cluster-launch-installer-libvirt-e2e.yaml).
ci-operator-installer.log.txt

For now, there are only a few e2e test cases for Installer. We, OCP QE, need to add more test cases for it.
For example, when upstream releases a new version of Installer, we want to use it to deploy OCP on OpenStack 12, 13, 14...

That's still the failure caused by the self-signed certificate, no?

Yes, you are right. I can handle the CA failure. After that, ci-operator raises problems b) and c) in the release-latest pod:
oc adm release new --max-per-registry=32 -n "ci-op-7ki07xfl" --from-image-stream "stable" --to-image-base "docker-registry.default.svc:5000/ci-op-7ki07xfl

Use --to-file instead of --to-image-b
or
use the OCP CI cluster.