tiredofit / docker-traefik-cloudflare-companion

Automatically Create CNAME records for containers served by Traefik

Randomly stopped working, log is full of errors

tehniemer opened this issue · comments

I had this working for quite some time, but it seems to have failed recently and I'm not sure why. The logs are full of errors like these and recreating the container doesn't fix it.

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3.8/site-packages/urllib3/util/retry.py", line 403, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3.8/site-packages/urllib3/packages/six.py", line 734, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 392, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.8/http/client.py", line 1255, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1301, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1250, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1010, in _send_output
    self.send(msg)
  File "/usr/lib/python3.8/http/client.py", line 950, in send
    self.connect()
  File "/usr/lib/python3.8/site-packages/docker/transport/unixconn.py", line 43, in connect
    sock.connect(self.unix_socket)
urllib3.exceptions.ProtocolError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

Can you set CONTAINER_LOG_LEVEL=DEBUG so we can see some more logs?

today at 1:31 PM Traceback (most recent call last):
today at 1:31 PM File "/usr/sbin/cloudflare-companion", line 240, in <module>
today at 1:31 PM client= docker.from_env()
today at 1:31 PM File "/usr/lib/python3.8/site-packages/docker/client.py", line 84, in from_env
today at 1:31 PM return cls(
today at 1:31 PM File "/usr/lib/python3.8/site-packages/docker/client.py", line 40, in __init__
today at 1:31 PM self.api = APIClient(*args, **kwargs)
today at 1:31 PM File "/usr/lib/python3.8/site-packages/docker/api/client.py", line 188, in __init__
today at 1:31 PM self._version = self._retrieve_server_version()
today at 1:31 PM File "/usr/lib/python3.8/site-packages/docker/api/client.py", line 212, in _retrieve_server_version
today at 1:31 PM raise DockerException(
today at 1:31 PM docker.errors.DockerException: Error while fetching server API version: HTTPConnectionPool(host='socket-proxy', port=2375): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fe72b6d5340>: Failed to establish a new connection: [Errno -2] Name does not resolve'))
today at 1:31 PM + PROCESS_NAME=traefik-cloudflare-companion
today at 1:31 PM + check_container_initialized
today at 1:31 PM + print_debug 'Checking to see if container initialization scripts have completed'
today at 1:31 PM + output_off
today at 1:31 PM + '[' true = TRUE ']'
today at 1:31 PM + '[' true = true ']'
today at 1:31 PM + set +x
today at 1:31 PM [DEBUG] /etc/services.available/10-cloudflare-companion/run ** [traefik-cloudflare-companion] Checking to see if container initialization scripts have completed
today at 1:31 PM + output_off
today at 1:31 PM + '[' true = TRUE ']'
today at 1:31 PM + '[' true = true ']'
today at 1:31 PM + set +x
today at 1:31 PM + check_service_initialized init
today at 1:31 PM + print_debug 'Checking to see if service has initialized'
today at 1:31 PM + output_off
today at 1:31 PM + '[' true = TRUE ']'
today at 1:31 PM + '[' true = true ']'
today at 1:31 PM + set +x
today at 1:31 PM [DEBUG] /etc/services.available/10-cloudflare-companion/run ** [traefik-cloudflare-companion] Checking to see if service has initialized
today at 1:31 PM + output_off
today at 1:31 PM + '[' true = TRUE ']'
today at 1:31 PM + '[' true = true ']'
today at 1:31 PM + set +x
today at 1:31 PM + liftoff
today at 1:31 PM + output_off
today at 1:31 PM + '[' true = TRUE ']'
today at 1:31 PM + '[' true = true ']'
today at 1:31 PM + set +x
today at 1:31 PM + print_info 'Starting Traefik Cloudflare Companion'
today at 1:31 PM + output_off
today at 1:31 PM + '[' true = TRUE ']'
today at 1:31 PM + '[' true = true ']'
today at 1:31 PM + set +x
today at 1:31 PM [INFO] /etc/services.available/10-cloudflare-companion/run ** [traefik-cloudflare-companion] Starting Traefik Cloudflare Companion
today at 1:31 PM + exec python3 -u /usr/sbin/cloudflare-companion
today at 1:31 PM Traceback (most recent call last):
today at 1:31 PM File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn
today at 1:31 PM conn = connection.create_connection(
today at 1:31 PM File "/usr/lib/python3.8/site-packages/urllib3/util/connection.py", line 61, in create_connection
today at 1:31 PM for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
today at 1:31 PM File "/usr/lib/python3.8/socket.py", line 918, in getaddrinfo
today at 1:31 PM for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
today at 1:31 PM socket.gaierror: [Errno -2] Name does not resolve
today at 1:31 PM
today at 1:31 PM During handling of the above exception, another exception occurred:
today at 1:31 PM
today at 1:31 PM Traceback (most recent call last):
today at 1:31 PM File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
today at 1:31 PM httplib_response = self._make_request(
today at 1:31 PM File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 392, in _make_request
today at 1:31 PM conn.request(method, url, **httplib_request_kw)
today at 1:31 PM File "/usr/lib/python3.8/http/client.py", line 1255, in request
today at 1:31 PM self._send_request(method, url, body, headers, encode_chunked)
today at 1:31 PM File "/usr/lib/python3.8/http/client.py", line 1301, in _send_request
today at 1:31 PM self.endheaders(body, encode_chunked=encode_chunked)
today at 1:31 PM File "/usr/lib/python3.8/http/client.py", line 1250, in endheaders
today at 1:31 PM self._send_output(message_body, encode_chunked=encode_chunked)
today at 1:31 PM File "/usr/lib/python3.8/http/client.py", line 1010, in _send_output
today at 1:31 PM self.send(msg)
today at 1:31 PM File "/usr/lib/python3.8/http/client.py", line 950, in send
today at 1:31 PM self.connect()
today at 1:31 PM File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 187, in connect
today at 1:31 PM conn = self._new_conn()
today at 1:31 PM File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 171, in _new_conn
today at 1:31 PM raise NewConnectionError(
today at 1:31 PM urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f1855c6d400>: Failed to establish a new connection: [Errno -2] Name does not resolve

Looks like something is erroring out in the connection to docker-socket-proxy? That's odd because a few other services are currently accessing it, and it was working with this before.
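The two failure modes in the logs above (a missing Unix socket file, and a TCP host name that does not resolve) can be checked separately from the companion script. The following is only a minimal sketch, not code from the image: the function name `check_docker_host` and the fallback values are assumptions made for illustration. It parses a DOCKER_HOST-style URL and reports whether the socket path exists or the host name resolves, using only the standard library.

```python
import os
import socket
from urllib.parse import urlparse

# Assumed fallback when DOCKER_HOST is unset (the usual local socket path).
DEFAULT_DOCKER_HOST = "unix:///var/run/docker.sock"


def check_docker_host(docker_host=None):
    """Return (ok, detail) for a DOCKER_HOST-style URL.

    Hypothetical helper: it only checks that the endpoint can be located
    (socket file exists, or host name resolves), not that Docker answers.
    """
    docker_host = docker_host or os.environ.get("DOCKER_HOST", DEFAULT_DOCKER_HOST)
    parsed = urlparse(docker_host)

    if parsed.scheme == "unix":
        # A missing file here corresponds to the FileNotFoundError traceback.
        if os.path.exists(parsed.path):
            return True, f"socket file {parsed.path} exists"
        return False, f"socket file {parsed.path} not found"

    if parsed.scheme == "tcp":
        host = parsed.hostname
        port = parsed.port or 2375  # 2375 is the conventional unencrypted port
        if not host:
            return False, f"no host name in {docker_host!r}"
        try:
            # Same lookup that fails with "Name does not resolve" in the logs.
            socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        except socket.gaierror as exc:
            return False, f"cannot resolve {host!r}: {exc}"
        return True, f"{host!r} resolves"

    return False, f"unsupported scheme in {docker_host!r}"
```

Run inside the container, `check_docker_host()` with no argument reads DOCKER_HOST from the environment; a result such as `(False, "cannot resolve 'socket-proxy': ...")` would point at container networking or DNS rather than at the Docker API itself.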

It looks like it for sure. I assume you are accessing the socket via TCP and port 2375.
One thing I started doing from tag 6.1.0 onwards was talking directly to the low-level Docker API, breaking away from the Python module, which seemingly didn't give me the results I wanted, especially with swarm mode. The image has seen more development activity than usual, especially in the past two weeks.

Would you mind sharing the necessary bits out of your docker-compose, or whatever file? Specifically I'm interested in the DOCKER_* environment variables.
How old is your Docker installation?

In the meantime, could you head inside the container, comment out line 241 of /usr/sbin/cloudflare-companion, and let me know whether that gets you past the issue?

You are correct. Here is the snippet from my compose file. I started using your container a little less than a month ago.

  cf-companion:
    container_name: cf-companion
    image: tiredofit/traefik-cloudflare-companion:latest
    restart: always
    networks:
      - proxy
      - socket_proxy
    depends_on:
      - socket-proxy
    security_opt:
      - no-new-privileges:true
    environment:
      TIMEZONE: $TZ
      DOCKER_HOST: 'tcp://socket-proxy:2375'
      TRAEFIK_VERSION: 2
      CF_EMAIL: $CLOUDFLARE_EMAIL
      CF_TOKEN: $CLOUDFLARE_API_KEY
      TARGET_DOMAIN: $DOMAINNAME
      DOMAIN1: $DOMAINNAME
      DOMAIN1_ZONE_ID: $CLOUDFLARE_ZONEID
      DOMAIN1_PROXIED: 'true'
    labels:
      - 'traefik.http.routers.cf-companion-rtr.rule=Host(`$DOMAINNAME`) || Host(`www.$DOMAINNAME`) || Host(`blueiris.$DOMAINNAME`) || Host(`openmediavault.$DOMAINNAME`) || Host(`proxmox.$DOMAINNAME`)'

Still many errors after commenting out line 241:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 392, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.8/http/client.py", line 1255, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1301, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1250, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1010, in _send_output
    self.send(msg)
  File "/usr/lib/python3.8/http/client.py", line 950, in send
    self.connect()
  File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 187, in connect
    conn = self._new_conn()
  File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 171, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f4e884dc220>: Failed to establish a new connection: [Errno -2] Name does not resolve

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3.8/site-packages/urllib3/util/retry.py", line 439, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='socket-proxy', port=2375): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f4e884dc220>: Failed to establish a new connection: [Errno -2] Name does not resolve'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/docker/api/client.py", line 205, in _retrieve_server_version
    return self.version(api_version=False)["ApiVersion"]
  File "/usr/lib/python3.8/site-packages/docker/api/daemon.py", line 181, in version
    return self._result(self._get(url), json=True)
  File "/usr/lib/python3.8/site-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/usr/lib/python3.8/site-packages/docker/api/client.py", line 228, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 543, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='socket-proxy', port=2375): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f4e884dc220>: Failed to establish a new connection: [Errno -2] Name does not resolve'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/sbin/cloudflare-companion", line 240, in <module>
    client= docker.from_env()
  File "/usr/lib/python3.8/site-packages/docker/client.py", line 84, in from_env
    return cls(
  File "/usr/lib/python3.8/site-packages/docker/client.py", line 40, in __init__
    self.api = APIClient(*args, **kwargs)
  File "/usr/lib/python3.8/site-packages/docker/api/client.py", line 188, in __init__
    self._version = self._retrieve_server_version()
  File "/usr/lib/python3.8/site-packages/docker/api/client.py", line 212, in _retrieve_server_version
    raise DockerException(
docker.errors.DockerException: Error while fetching server API version: HTTPConnectionPool(host='socket-proxy', port=2375): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f4e884dc220>: Failed to establish a new connection: [Errno -2] Name does not resolve'))

So bizarre. Looks like it's having trouble finding the socket, which it really shouldn't.
A month ago you were using the 4.x series.
It was a Bash script that built a Python script based on the number of domains you had, then executed that Python script.
5.x was very short-lived: it brought in updating of domain records, a few tricks with Docker secrets, and very basic swarm mode. It used the same techniques as version 4.x.

For 6.x.x I rewrote the script in pure Python. The underpinnings are still the same: we're still calling Docker the same way, other than this API call, which is really only used when Docker Swarm is in use. To be a pain, could you try the following tags in order:

tiredofit/traefik-cloudflare-companion:6.0.2 - and see if you get the same results, followed by
tiredofit/traefik-cloudflare-companion:5.0.0 - and finally
tiredofit/traefik-cloudflare-companion:4.2.1

I'd also be curious about the output of docker images | grep traefik-cloudflare-companion, to see which hashes you were actually using for the working version, so we can work backwards and see what's going on here.

And finally - I hate to ask this one.. When was the last time you restarted your docker service?

The docker service was restarted this morning after the container was recreated with no change in behavior.

root@Docker:/opt/docker# docker images |grep traefik-cloudflare-companion
tiredofit/traefik-cloudflare-companion       latest              e333c68f9bd9        2 days ago          339MB

tiredofit/traefik-cloudflare-companion:6.0.2 works as expected with no changes to my compose file.

Thanks for the update. I'll try to pull a copy of latest, see what's happening there, and repush. Docker Hub sometimes does some really amazing stuff when I am pushing a lot of images, but this certainly is a new one for me. I'll try to take care of this tomorrow, and I'm glad to know you are back in business.

Great, let me know when I should try latest again. 👍

I still feel it's the API call I am making that is causing this breakdown. In the most recent :latest image I've added another check so that it is only called if SWARM_MODE=TRUE. Can you let me know how you fare?
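For readers following along, a guard of this kind might look roughly like the sketch below. It is not the code shipped in the image: the helper names `swarm_mode_enabled` and `list_swarm_services` are hypothetical, and the case-insensitive comparison is an assumption. The only library call it relies on is `client.services.list()` from the Python Docker module, and that call is skipped entirely unless SWARM_MODE is set to TRUE.

```python
import os


def swarm_mode_enabled(environ=os.environ):
    """True only when SWARM_MODE is set to TRUE (case-insensitive, assumed)."""
    return environ.get("SWARM_MODE", "FALSE").strip().upper() == "TRUE"


def list_swarm_services(client, environ=os.environ):
    """Query swarm services only when swarm mode is switched on.

    `client` is expected to behave like the object returned by
    docker.from_env(); outside swarm mode no swarm API call is made.
    """
    if not swarm_mode_enabled(environ):
        return []
    return client.services.list()
```

With SWARM_MODE unset or FALSE, the function returns an empty list without touching the swarm API, so a standalone Docker host never exercises the swarm-only code path.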

It looks as if I'm going to have to stay within constraints of the Python Docker module as it contains some of the calls I am making to get information on Container IDs vs Services for Swarm.

Just pulled latest and it's working as expected again, thanks!

Thanks for helping me diagnose this.