openshift / openshift-sdn

Running into issues when using s2image

zak-hassan opened this issue · comments

commented

For some reason the build is failing on the node.

[My Setup]
A 3-node CentOS 7.0 cluster, provisioned with the Ansible installer. For some reason I can't get source-to-image builds to work. I'm using OpenShift Origin 3.1.

I considered opening a ticket under openshift/origin, but after reading this discussion I think it could be an openshift-sdn problem:

openshift/origin#6071

Here are my logs.

[root@node1 ~]# docker logs f17d3c09dfec
I0327 19:42:11.517302 1 builder.go:57] Master version "v1.1.4", Builder version "v1.1.4"
I0327 19:42:11.521576 1 builder.go:145] Running build with cgroup limits: api.CGroupLimits{MemoryLimitBytes:9223372036854775807, CPUShares:2, CPUPeriod:100000, CPUQuota:-1, MemorySwap:9223372036854775807}
I0327 19:42:11.524237 1 sti.go:195] The value of ALLOWED_UIDS is [1-]
I0327 19:42:11.524291 1 sti.go:203] The value of DROP_CAPS is [KILL,MKNOD,SETGID,SETUID,SYS_CHROOT]
I0327 19:42:11.530869 1 docker.go:287] Image "openshift/wildfly-100-centos7@sha256:7127899a3cb475e2a160b9a932147fb0c8d6c21a87f16fcb625d8a880537aac8" not available locally, pulling ...
I0327 19:42:11.530909 1 docker.go:309] Pulling Docker image openshift/wildfly-100-centos7@sha256:7127899a3cb475e2a160b9a932147fb0c8d6c21a87f16fcb625d8a880537aac8 ...
I0327 19:44:39.763505 1 sti.go:222] Creating a new S2I builder with build config: "Builder Name:\t\t\tWildFly 10.0.0.Final\nBuilder Image:\t\t\topenshift/wildfly-100-centos7@sha256:7127899a3cb475e2a160b9a932147fb0c8d6c21a87f16fcb625d8a880537aac8\nBuilder Image Version:\t\tf127204\nBuilder Base Version:\t\t0c1672f\nSource:\t\t\t\tfile:///tmp/s2i-build416721069/upload/src#master\nOutput Image Tag:\t\taxep/sample-jee-app-1:7899687a\nEnvironment:\t\t\tOPENSHIFT_BUILD_NAME=sample-jee-app-1,OPENSHIFT_BUILD_NAMESPACE=axep,OPENSHIFT_BUILD_SOURCE=https://github.com/bparees/openshift-jee-sample.git,OPENSHIFT_BUILD_REFERENCE=master\nIncremental Build:\t\tdisabled\nRemove Old Build:\t\tdisabled\nBuilder Pull Policy:\t\tif-not-present\nPrevious Image Pull Policy:\talways\nQuiet:\t\t\t\tdisabled\nLayered Build:\t\t\tdisabled\nWorkdir:\t\t\t/tmp/s2i-build416721069\nDocker NetworkMode:\t\tcontainer:f17d3c09dfec8f42cc7df82d716a6e2a76d0e213959e3b086224ba98c1ae7b09\nDocker Endpoint:\t\tunix:///var/run/docker.sock\n"
I0327 19:44:39.785854 1 docker.go:291] Using locally available image "openshift/wildfly-100-centos7@sha256:7127899a3cb475e2a160b9a932147fb0c8d6c21a87f16fcb625d8a880537aac8"
I0327 19:44:39.860141 1 docker.go:291] Using locally available image "openshift/wildfly-100-centos7@sha256:7127899a3cb475e2a160b9a932147fb0c8d6c21a87f16fcb625d8a880537aac8"
I0327 19:44:39.860218 1 docker.go:411] Image contains io.openshift.s2i.scripts-url set to 'image:///usr/libexec/s2i'
I0327 19:44:39.863973 1 sti.go:140] Preparing to build axep/sample-jee-app-1:7899687a
I0327 19:44:39.864581 1 source.go:197] Downloading "https://github.com/bparees/openshift-jee-sample.git" ...
I0327 19:44:40.716517 1 source.go:208] Cloning source from https://github.com/bparees/openshift-jee-sample.git
I0327 19:44:42.220767 1 install.go:236] Using "assemble" installed from "image:///usr/libexec/s2i/assemble"
I0327 19:44:42.220845 1 install.go:236] Using "run" installed from "image:///usr/libexec/s2i/run"
I0327 19:44:42.220895 1 install.go:236] Using "save-artifacts" installed from "image:///usr/libexec/s2i/save-artifacts"
I0327 19:44:42.220964 1 sti.go:152] Clean build will be performed
I0327 19:44:42.221003 1 sti.go:155] Performing source build from file:///tmp/s2i-build416721069/upload/src#master
I0327 19:44:42.221017 1 sti.go:166] Running "assemble" in "axep/sample-jee-app-1:7899687a"
I0327 19:44:42.221042 1 sti.go:424] Using image name openshift/wildfly-100-centos7@sha256:7127899a3cb475e2a160b9a932147fb0c8d6c21a87f16fcb625d8a880537aac8
I0327 19:44:42.221075 1 sti.go:428] No user environment provided (no environment file found in application sources)
I0327 19:44:42.221231 1 sti.go:530] starting the source uploading ...
I0327 19:44:42.243301 1 docker.go:411] Image contains io.openshift.s2i.scripts-url set to 'image:///usr/libexec/s2i'
I0327 19:44:42.243338 1 docker.go:466] Base directory for STI scripts is '/usr/libexec/s2i'. Untarring destination is '/opt/s2i/destination'.
I0327 19:44:42.243365 1 docker.go:616] Creating container using config: &{Hostname: Domainname: User: Memory:0 MemorySwap:0 MemoryReservation:0 KernelMemory:0 CPUShares:0 CPUSet: AttachStdin:false AttachStdout:true AttachStderr:false PortSpecs:[] ExposedPorts:map[] StopSignal: Tty:false OpenStdin:true StdinOnce:true Env:[OPENSHIFT_BUILD_SOURCE=https://github.com/bparees/openshift-jee-sample.git OPENSHIFT_BUILD_REFERENCE=master OPENSHIFT_BUILD_NAME=sample-jee-app-1 OPENSHIFT_BUILD_NAMESPACE=axep] Cmd:[/bin/sh -c tar -C /opt/s2i/destination -xf - && /usr/libexec/s2i/assemble] DNS:[] Image:openshift/wildfly-100-centos7@sha256:7127899a3cb475e2a160b9a932147fb0c8d6c21a87f16fcb625d8a880537aac8 Volumes:map[] VolumeDriver: VolumesFrom: WorkingDir: MacAddress: Entrypoint:[] NetworkDisabled:false SecurityOpts:[] OnBuild:[] Mounts:[] Labels:map[]}, hostconfig: &{Binds:[] CapAdd:[] CapDrop:[KILL MKNOD SETGID SETUID SYS_CHROOT] GroupAdd:[] ContainerIDFile: LxcConf:[] Privileged:false PortBindings:map[] Links:[] PublishAllPorts:false DNS:[] DNSOptions:[] DNSSearch:[] ExtraHosts:[] VolumesFrom:[] NetworkMode:container:f17d3c09dfec8f42cc7df82d716a6e2a76d0e213959e3b086224ba98c1ae7b09 IpcMode: PidMode: UTSMode: RestartPolicy:{Name: MaximumRetryCount:0} Devices:[] LogConfig:{Type: Config:map[]} ReadonlyRootfs:false SecurityOpt:[] CgroupParent: Memory:9223372036854775807 MemorySwap:9223372036854775807 MemorySwappiness:0 OOMKillDisable:false CPUShares:2 CPUSet: CPUSetCPUs: CPUSetMEMs: CPUQuota:-1 CPUPeriod:100000 BlkioWeight:0 Ulimits:[] VolumeDriver: OomScoreAdj:0}
I0327 19:44:42.617109 1 docker.go:623] Attaching to container
I0327 19:44:42.620547 1 docker.go:629] Starting container
I0327 19:44:59.135018 1 cleanup.go:23] Removing temporary directory /tmp/s2i-build416721069
I0327 19:44:59.135564 1 fs.go:156] Removing directory '/tmp/s2i-build416721069'
F0327 19:44:59.154433 1 builder.go:204] Error: build error: API error (500): Cannot start container 4d83d07a426cc97232eb49bcc4292205d2307240cb4d6060632d1d018e896a20: [8] System error: write /sys/fs/cgroup/memory/system.slice/docker-4d83d07a426cc97232eb49bcc4292205d2307240cb4d6060632d1d018e896a20.scope/memory.limit_in_bytes: invalid argument
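
The fatal line is easier to read next to the limits the builder logged at startup. As a minimal sketch (Python, not part of the builder; the helper names are made up for illustration), the following pulls the `api.CGroupLimits{...}` fields out of a builder log line and flags the ones equal to 2^63 - 1, the largest signed 64-bit integer. The log shows that same value later passed as `Memory` in the container's hostconfig, just before the write to `memory.limit_in_bytes` is rejected.

```python
import re

INT64_MAX = 2**63 - 1  # 9223372036854775807


def parse_cgroup_limits(log_line):
    """Return the fields inside api.CGroupLimits{...} as a dict of ints."""
    match = re.search(r"api\.CGroupLimits\{([^}]*)\}", log_line)
    if match is None:
        return {}
    fields = {}
    for pair in match.group(1).split(","):
        if not pair.strip():
            continue  # tolerate empty braces or a trailing comma
        name, _, value = pair.strip().partition(":")
        fields[name] = int(value)
    return fields


def unlimited_fields(limits):
    """Names of the fields whose value is the int64 maximum."""
    return sorted(name for name, value in limits.items() if value == INT64_MAX)


# The builder.go:145 line from the log above, copied verbatim.
line = ('I0327 19:42:11.521576 1 builder.go:145] Running build with cgroup limits: '
        'api.CGroupLimits{MemoryLimitBytes:9223372036854775807, CPUShares:2, '
        'CPUPeriod:100000, CPUQuota:-1, MemorySwap:9223372036854775807}')
limits = parse_cgroup_limits(line)
print(unlimited_fields(limits))
```

For this log line the flagged fields are `MemoryLimitBytes` and `MemorySwap`.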

commented

@bparees: I wanted to know if I'm missing anything. I'm using your source.

This is a known issue with the latest docker packages, which has been fixed in origin here:
openshift/origin#8202

Strictly speaking, the fix is in the openshift/origin-sti-builder and openshift/origin-docker-builder images. If you pull the latest versions of those images to your nodes and run openshift start with the "--latest-images" flag (which causes the system to use the latest image available on the node), you should be able to get things working.

(Oh, and it's definitely not related to openshift-sdn, but hopefully the advice above will get you going and we can just close this out rather than worry about moving it.)

commented

Yes, we can definitely close this. You mentioned I could pass in an option there to get the latest images. Would I do this on the master? I suspect this is the file I'd update, since the origin-master systemd service loads its options from here:

/etc/sysconfig/origin-master

[root@master system]# cat /etc/sysconfig/origin-master 
OPTIONS=--loglevel=2   --latest-images   
CONFIG_FILE=/etc/origin/master/master-config.yaml

# Proxy configuration
# Origin uses standard HTTP_PROXY environment variables. Be sure to set
# NO_PROXY for your master
#NO_PROXY=master.example.com
#HTTP_PROXY=http://USER:PASSWORD@IPADDR:PORT
#HTTPS_PROXY=https://USER:PASSWORD@IPADDR:PORT
[root@master system]# cat origin-master.service 
[Unit]
Description=Origin Master Service
Documentation=https://github.com/openshift/origin
After=network.target
After=etcd.service
Before=origin-node.service
Requires=network.target

[Service]
Type=notify
EnvironmentFile=/etc/sysconfig/origin-master
Environment=GOTRACEBACK=crash
ExecStart=/usr/bin/openshift start master --config=${CONFIG_FILE} $OPTIONS
LimitNOFILE=131072
LimitCORE=infinity
WorkingDirectory=/var/lib/origin/
SyslogIdentifier=origin-master
Restart=always

[Install]
WantedBy=multi-user.target
WantedBy=origin-node.service
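
To see what command line the unit above ends up running, here is a rough Python sketch of how systemd combines `EnvironmentFile` with `ExecStart`: `${CONFIG_FILE}` is substituted as a single argument, while the unbraced `$OPTIONS` is split on whitespace into separate arguments. It is a simplification (no quoting or escape handling), and the function names are invented for illustration.

```python
import re


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def expand_exec_start(exec_start, env):
    """Expand ${VAR} in place; split a standalone $VAR on whitespace."""
    argv = []
    for word in exec_start.split():
        if re.fullmatch(r"\$[A-Za-z_][A-Za-z0-9_]*", word):
            # Unbraced form: zero or more separate arguments.
            argv.extend(env.get(word[1:], "").split())
        else:
            # Braced form: substituted inside the word, still one argument.
            argv.append(re.sub(r"\$\{(\w+)\}",
                               lambda m: env.get(m.group(1), ""), word))
    return argv


# Values copied from /etc/sysconfig/origin-master and origin-master.service above.
sysconfig = """OPTIONS=--loglevel=2   --latest-images
CONFIG_FILE=/etc/origin/master/master-config.yaml
#NO_PROXY=master.example.com
"""
exec_start = "/usr/bin/openshift start master --config=${CONFIG_FILE} $OPTIONS"
argv = expand_exec_start(exec_start, parse_env_file(sysconfig))
print(" ".join(argv))
```

Under these assumptions, `--latest-images` reaches `openshift start master` as its own argument once the unit is restarted.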

commented

Even after trying that, I'm getting a different error:
F0328 07:49:27.970687 1 builder.go:204] Error: build error: Failed to push image. Response from registry is: Put http://172.30.28.203:5000/v1/repositories/loans/cakephp-example/: dial tcp 172.30.28.203:5000: no route to host

I suspect my registry isn't set up correctly. I'll keep working on it, but I wanted to double-check: is this a bug?

I'm not sure about the issue at the top, but the failure to connect to the docker registry is not a bug.
Please make sure your docker-registry pod is running properly. You should be able to reach the registry from any node, or from any running pod, with curl 172.30.28.203:5000/v2/ (the IP may change if you re-create the registry).

commented

On the master:

[root@master centos]# curl 172.30.28.203:5000/v2/
curl: (7) Failed connect to 172.30.28.203:5000; No route to host

on node1:

[root@node1 ~]# curl 172.30.28.203:5000/v2/
curl: (7) Failed connect to 172.30.28.203:5000; No route to host

on node 2:

[root@node2 ~]# curl 172.30.28.203:5000/v2/
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}

In conclusion, it looks like the registry was set up on node 2 and is not reachable via curl from the master or node1.

I suspect I'll need to set up a router.
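
The three curl results above can be read mechanically. The sketch below (Python; `classify_curl_output` is a made-up helper, not an OpenShift or Docker tool) treats a curl exit-code-7 message as "no TCP connection" and a JSON body as "the registry answered", since an `UNAUTHORIZED` reply still proves the request reached the registry.

```python
import json


def classify_curl_output(output):
    """Classify the text printed by `curl <registry-ip>:5000/v2/`."""
    text = output.strip()
    if text.startswith("curl: (7)"):
        return "unreachable"  # curl could not open a connection at all
    try:
        body = json.loads(text)
    except ValueError:
        return "unknown"
    if not isinstance(body, dict):
        return "unknown"
    errors = body.get("errors") or []
    codes = [e.get("code") for e in errors if isinstance(e, dict)]
    if "UNAUTHORIZED" in codes:
        return "reachable (auth required)"
    return "reachable"


# Outputs copied from the three hosts above.
results = {
    "master": "curl: (7) Failed connect to 172.30.28.203:5000; No route to host",
    "node1": "curl: (7) Failed connect to 172.30.28.203:5000; No route to host",
    "node2": '{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}',
}
for host, output in results.items():
    print(host, "->", classify_curl_output(output))
```

Read this way, the registry itself is answering on node2; what fails on the other two hosts is the route to the service IP.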

OK, then the docker-registry is running fine on your node2.
So this looks like a network configuration problem in your cluster.
I suggest checking:
a) whether the iptables rules for the docker registry exist in the nat table: iptables -nL -t nat | grep docker-registry
b) whether UDP port 4789 is open on all the nodes in the default (filter) table: iptables -nL | grep 4789
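
Check (b) can be made a little stricter than a plain `grep 4789`, which also matches unrelated lines that happen to contain those digits. The sketch below (Python; the function name and the sample rule text are illustrative assumptions, not output from this cluster) looks for an ACCEPT rule on UDP destination port 4789, the IANA-assigned VXLAN port, in saved `iptables -nL` output.

```python
def allows_vxlan(iptables_listing, port=4789):
    """True if an ACCEPT rule for udp names dpt:<port> in `iptables -nL` text."""
    needle = "dpt:%d" % port
    for line in iptables_listing.splitlines():
        fields = line.split()
        if (len(fields) >= 2 and fields[0] == "ACCEPT"
                and fields[1] == "udp" and needle in fields):
            return True
    return False


# Hypothetical sample in the usual `iptables -nL` layout (not taken from the thread).
sample = """Chain OS_FIREWALL_ALLOW (1 references)
target     prot opt source               destination
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:10250
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            state NEW udp dpt:4789
"""
print(allows_vxlan(sample))
```

Matching whole whitespace-separated fields means a rule for, say, port 47890 or a TCP rule on 4789 is not counted.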

Sounds like you're past the cgroups issue; over to @knobunc to triage the network config.

commented

Yes, thank you. I did get past the cgroup problem. I'm hoping to get some help with the network/SDN side.

https://raw.githubusercontent.com/openshift/openshift-sdn/master/hack/debug.sh is a script you can run that will gather data about the cluster that can help us to debug. (It has to be run from a machine/account that can ssh as root to the master and each node.)

Did you reinstall everything after fixing the previous problem? This seems like the sort of thing that might just boil down to there being some lingering leftover brokenness from the previous bug.

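@zmhassan, could you try restarting the services on one of the nodes where the curl is failing (the origin-master unit shown earlier lists origin-node.service, so on an Origin install the node unit is probably origin-node rather than atomic-openshift-node):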

systemctl stop atomic-openshift-node docker
rm -rf /run/openshift-sdn
systemctl restart iptables openvswitch
systemctl start atomic-openshift-node

Then try your curl again.