groupon / DotCi

DotCi Jenkins github integration, .ci.yml http://groupon.github.io/DotCi

Docker Compose builds are leaking containers in one case

srlochen opened this issue · comments

We've seen one case where a Docker Compose build leaked containers that were not cleaned up after the builds finished. These containers continued to hang around and, over time, accumulated to the point that they starved the build host of resources.

In the one case we saw, the job orchestrated a number of containers with Docker Compose, but only one container stuck around. It appeared that the build was trying to kill the container but was using the wrong string for the container name. The build showed the following command:

trap "docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 kill; docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 rm -v --force; exit" PIPE QUIT INT HUP EXIT TERM

But the host on which it was running had the following container:
# docker ps
CONTAINER ID   IMAGE                            COMMAND              CREATED         STATUS          PORTS   NAMES
2fff59259659   cibackbeatbackbeatserver274_ci   "bin/run-tests.sh"   4 minutes ago   Up 15 seconds

Note the difference between cibackbeatbackbeatserver274 and cibackbeatbackbeatserver274_ci

We need to close this gap so containers don't hang around indefinitely.

src/main/java/com/groupon/jenkins/buildtype/dockercompose/BuildConfiguration.java#L75 needs investigation into why the container for the ci service was not properly cleaned up (i.e. cibackbeatbackbeatserver274_ci).

...
$  trap "docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 kill; docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 rm -v --force; exit" PIPE QUIT INT HUP EXIT TERM
$  docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 
...
Killing cibackbeatbackbeatserver274_database_1...
Killing cibackbeatbackbeatserver274_redis_1...
Removing cibackbeatbackbeatserver274_redis_1...
Removing cibackbeatbackbeatserver274_database_1...
Going to remove cibackbeatbackbeatserver274_database_1, cibackbeatbackbeatserver274_redis_1

I wonder whether docker/compose#1651 resolves this issue as of docker-compose >= 1.4.0.

The issue appears to be the use of docker-compose run instead of docker-compose up.
This may be a legitimate new docker-compose bug, based on the following:

$ docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 ps
                 Name                               Command               State    Ports
------------------------------------------------------------------------------------------
cibackbeatbackbeatserver274_ci_run_1     bin/run-tests.sh                 Up
cibackbeatbackbeatserver274_database_1   /docker-entrypoint.sh postgres   Up      5432/tcp
cibackbeatbackbeatserver274_redis_1      /entrypoint.sh redis-server      Up      6379/tcp

$ docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 stop
Stopping cibackbeatbackbeatserver274_database_1... done
Stopping cibackbeatbackbeatserver274_redis_1... done

$ docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 stop ci

$ docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 kill ci

$ docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 rm -v --force ci
No stopped containers

$ docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 ps
                Name                       Command        State   Ports
-----------------------------------------------------------------------
cibackbeatbackbeatserver274_ci_run_1   bin/run-tests.sh   Up

$ docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 kill

$ docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 ps
                 Name                               Command                State     Ports
------------------------------------------------------------------------------------------
cibackbeatbackbeatserver274_ci_run_1     bin/run-tests.sh                 Up
cibackbeatbackbeatserver274_database_1   /docker-entrypoint.sh postgres   Exit 137
cibackbeatbackbeatserver274_redis_1      /entrypoint.sh redis-server      Exit 0

$ docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 rm -v --force
Going to remove cibackbeatbackbeatserver274_database_1, cibackbeatbackbeatserver274_redis_1
Removing cibackbeatbackbeatserver274_database_1... done
Removing cibackbeatbackbeatserver274_redis_1... done

$ docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 ps
                Name                       Command        State   Ports
-----------------------------------------------------------------------
cibackbeatbackbeatserver274_ci_run_1   bin/run-tests.sh   Up

the solution appears to be

$ docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 ps
                Name                       Command        State   Ports
-----------------------------------------------------------------------
cibackbeatbackbeatserver274_ci_run_1   bin/run-tests.sh   Up

$ docker stop cibackbeatbackbeatserver274_ci_run_1
cibackbeatbackbeatserver274_ci_run_1

$ docker kill cibackbeatbackbeatserver274_ci_run_1

$ docker rm -v --force cibackbeatbackbeatserver274_ci_run_1

$ docker-compose -f docker-compose.yml -p cibackbeatbackbeatserver274 ps
Name   Command   State   Ports
------------------------------

This is a docker-compose issue, so I created docker/compose#2184.

The reason is that docker-compose [ps | logs | rm] does not identify the availability of the container created by docker-compose run SERVICE.
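The listings above suggest a naming pattern: containers started as regular services look like <project>_<service>_<n>, while the one-off container left behind by docker-compose run looks like <project>_<service>_run_<n>. The following is a minimal shell sketch that picks out the one-off containers of a project from a list of names. The pattern is an assumption inferred only from the output in this thread, and the helper name one_off_containers is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read container names on stdin (one per line) and print
# only those that look like one-off `docker-compose run` containers of the
# given project, i.e. <project>_<service>_run_<n>.
# The naming pattern is an assumption taken from the `ps` output in this thread.
one_off_containers() {
    project="$1"
    # `|| true` keeps the function from failing when nothing matches.
    grep -E "^${project}_.+_run_[0-9]+\$" || true
}

# Example use (requires a Docker host, so it is left commented out):
#   docker ps -a --format '{{.Names}}' \
#     | one_off_containers cibackbeatbackbeatserver274 \
#     | while read -r name; do docker rm -vf "$name"; done
```

Filtering by name keeps the cleanup independent of which service was passed to docker-compose run, at the cost of relying on a naming convention that may differ between docker-compose versions.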

Using after_each: (available as of DotCi-2.22.26.0), I was able to target the container specifically. Unfortunately, this solution involves hard-coding the service name, so a more generic solution is unavailable unless it becomes part of the Docker Compose build type.

run:
  master:

after_each: 'docker rm -vf <% out<< JOB_NAME.toLowerCase().replaceAll('[^0-9a-z]','') %><% out<< BUILD_NUMBER %>_master_run_1'
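To make the template in after_each: easier to follow, here is a rough shell equivalent of the name it computes: the job name is lowercased, everything outside 0-9 and a-z is dropped, the build number is appended, and the suffix _<service>_run_1 follows. The function name one_off_name and the sample job name in the usage comment are hypothetical. The trailing _run_1 assumes a single docker-compose run per build, as in the after_each: line above.

```shell
#!/bin/sh
# Hypothetical helper mirroring the after_each: template:
#   JOB_NAME.toLowerCase().replaceAll('[^0-9a-z]','') + BUILD_NUMBER + _<service>_run_1
one_off_name() {
    job_name="$1"
    build_number="$2"
    service="$3"
    # Lowercase first, then delete every character that is not 0-9 or a-z.
    project="$(printf '%s' "$job_name" \
        | LC_ALL=C tr '[:upper:]' '[:lower:]' \
        | LC_ALL=C tr -cd '0-9a-z')"
    printf '%s%s_%s_run_1\n' "$project" "$build_number" "$service"
}

# Example use with a made-up job name (requires a Docker host):
#   docker rm -vf "$(one_off_name 'Example-Org/My_Job' 42 master)"
```

Taking the service as a parameter still leaves the caller responsible for knowing which service was run, which is the same limitation noted above.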

Ideally, a docker-compose bug should be filed to address this. The issue still exists as of:

docker-compose version 1.5.2, build 7240ff3

Going to wait for docker/compose#2593

Determined that the proper solution is the PR below, which changes the bash trap to use docker-compose rm --all --force.
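As an illustration only, the trap from the start of this thread with --all added to the rm step might look like the sketch below. The helper names cleanup_command and install_cleanup_trap are hypothetical, keeping -v alongside --all is an assumption, and this is not the code from the PR. Whether the kill step also reaches the one-off container depends on the docker-compose version and is not verified here.

```shell
#!/bin/sh
# Hypothetical helper: print the cleanup command for a compose file and
# project name, based on the trap shown earlier in this thread with
# `--all` added to the `rm` step.
cleanup_command() {
    file="$1"
    project="$2"
    printf 'docker-compose -f %s -p %s kill; docker-compose -f %s -p %s rm -v --all --force; exit\n' \
        "$file" "$project" "$file" "$project"
}

# Hypothetical helper: install the cleanup command as a trap on the same
# signals the original build script used. Not called here, so the sketch
# can be sourced on a machine without Docker.
install_cleanup_trap() {
    trap "$(cleanup_command "$1" "$2")" PIPE QUIT INT HUP EXIT TERM
}

# Example use:
#   install_cleanup_trap docker-compose.yml cibackbeatbackbeatserver274
```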