worstcase / blockade

Docker-based utility for testing network failures and partitions in distributed applications

Home Page: http://blockade.readthedocs.org

Blockade unable to start after image re-build

fbushman opened this issue

I'm testing an image built by the Maven Spring Boot plugin. The Blockade cluster (2 nodes) came up normally the first time. I then destroyed the blockade, changed some code, rebuilt the image, and ran blockade up again, which failed with this:

$ blockade up

Unexpected error! This may be a Blockade bug.

Traceback (most recent call last):
  File "/home/bushman/.local/lib/python2.7/site-packages/blockade/cli.py", line 617, in main
    opts.func(opts)
  File "/home/bushman/.local/lib/python2.7/site-packages/blockade/cli.py", line 165, in cmd_up
    containers = b.create(verbose=opts.verbose, force=opts.force)
  File "/home/bushman/.local/lib/python2.7/site-packages/blockade/core.py", line 96, in create
    container_id = self._start_container(container, force)
  File "/home/bushman/.local/lib/python2.7/site-packages/blockade/core.py", line 187, in _start_container
    self.docker_client.start(container_id)
  File "/home/bushman/.local/lib/python2.7/site-packages/docker/utils/decorators.py", line 21, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/home/bushman/.local/lib/python2.7/site-packages/docker/api/container.py", line 1065, in start
    self._raise_for_status(res)
  File "/home/bushman/.local/lib/python2.7/site-packages/docker/api/client.py", line 216, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/home/bushman/.local/lib/python2.7/site-packages/docker/errors.py", line 30, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
APIError: 500 Server Error: Internal Server Error for url: http+docker://localunixsocket/v1.24/containers/5ed75fefe61adeb5aa157b4c3e9fd9f616532c0231992a80bf1932c385865b2d/start ("Cannot link to a non running container: /simplecluster_seedA AS /simplecluster_nodeA/seedA")

I guessed it couldn't hurt to try again:

$ blockade up

Unexpected error! This may be a Blockade bug.

Traceback (most recent call last):
  File "/home/bushman/.local/lib/python2.7/site-packages/blockade/cli.py", line 617, in main
    opts.func(opts)
  File "/home/bushman/.local/lib/python2.7/site-packages/blockade/cli.py", line 165, in cmd_up
    containers = b.create(verbose=opts.verbose, force=opts.force)
  File "/home/bushman/.local/lib/python2.7/site-packages/blockade/core.py", line 96, in create
    container_id = self._start_container(container, force)
  File "/home/bushman/.local/lib/python2.7/site-packages/blockade/core.py", line 174, in _start_container
    container_id = create_container()
  File "/home/bushman/.local/lib/python2.7/site-packages/blockade/core.py", line 170, in create_container
    labels={"blockade.id": self.state.blockade_id})
  File "/home/bushman/.local/lib/python2.7/site-packages/docker/api/container.py", line 446, in create_container
    return self.create_container_from_config(config, name)
  File "/home/bushman/.local/lib/python2.7/site-packages/docker/api/container.py", line 457, in create_container_from_config
    return self._result(res, True)
  File "/home/bushman/.local/lib/python2.7/site-packages/docker/api/client.py", line 220, in _result
    self._raise_for_status(response)
  File "/home/bushman/.local/lib/python2.7/site-packages/docker/api/client.py", line 216, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/home/bushman/.local/lib/python2.7/site-packages/docker/errors.py", line 30, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
APIError: 409 Client Error: Conflict for url: http+docker://localunixsocket/v1.24/containers/create?name=simplecluster_seedA ("Conflict. The container name "/simplecluster_seedA" is already in use by container "c762ab326e9781d11c28689cfaf33c59a12eb4974e7d6ad9b7741a242ba72347". You have to remove (or rename) that container to be able to reuse that name.")

The error shows that Blockade was trying to reuse a container name that was still taken, so I removed the old containers and tried again. The first error came back:

$ blockade up

Unexpected error! This may be a Blockade bug.

Traceback (most recent call last):
  File "/home/bushman/.local/lib/python2.7/site-packages/blockade/cli.py", line 617, in main
    opts.func(opts)
  File "/home/bushman/.local/lib/python2.7/site-packages/blockade/cli.py", line 165, in cmd_up
    containers = b.create(verbose=opts.verbose, force=opts.force)
  File "/home/bushman/.local/lib/python2.7/site-packages/blockade/core.py", line 96, in create
    container_id = self._start_container(container, force)
  File "/home/bushman/.local/lib/python2.7/site-packages/blockade/core.py", line 187, in _start_container
    self.docker_client.start(container_id)
  File "/home/bushman/.local/lib/python2.7/site-packages/docker/utils/decorators.py", line 21, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/home/bushman/.local/lib/python2.7/site-packages/docker/api/container.py", line 1065, in start
    self._raise_for_status(res)
  File "/home/bushman/.local/lib/python2.7/site-packages/docker/api/client.py", line 216, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/home/bushman/.local/lib/python2.7/site-packages/docker/errors.py", line 30, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
APIError: 500 Server Error: Internal Server Error for url: http+docker://localunixsocket/v1.24/containers/3a1d1486d236730c1cbc5e3006bc4c3eef4d17eba4efa28e11b8567ffa7b4d15/start ("Cannot link to a non running container: /simplecluster_seedA AS /simplecluster_nodeA/seedA")

My blockade.yaml:

containers:
  seedA:
    image: resilience/simple-resilient-cluster:latest
    hostname: seedA
    expose: [8080, 42001]
    ports: {8080: 8080}
    environment: {
      CLUSTER_SEED_HOST: seedA,
      CLUSTER_LOCAL_HOST: seedA,
      RESILIENCE_GOSSIP_TIME: 5000,
      RESILIENCE_GOSSIP_TARGETS: 1,
      RESILIENCE_INFECTION_TIME: 1000,
      RESILIENCE_INFECTION_THRESHOLD: 1,
      RESILIENCE_INFECT_PER_ROUND: 1,
      RESILIENCE_PUSH_TIMEOUT: 500,
      RESILIENCE_PUSH_TARGETS: 1
    }
    volumes: {
      "./logs": "/tmp/resilience_logs"
    }

  nodeA:
    image: resilience/simple-resilient-cluster:latest
    hostname: nodeA
    start_delay: 10
    expose: [8080, 42001]
    ports: {8081: 8080}
    environment: {
      CLUSTER_SEED_HOST: seedA,
      CLUSTER_LOCAL_HOST: nodeA,
      RESILIENCE_GOSSIP_TIME: 5000,
      RESILIENCE_GOSSIP_TARGETS: 1,
      RESILIENCE_INFECTION_TIME: 1000,
      RESILIENCE_INFECTION_THRESHOLD: 1,
      RESILIENCE_INFECT_PER_ROUND: 1,
      RESILIENCE_PUSH_TIMEOUT: 500,
      RESILIENCE_PUSH_TARGETS: 1
    }
    volumes: {
      "./logs": "/tmp/resilience_logs"
    }
    links: [seedA]
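
In this configuration nodeA links to seedA, so seedA has to be created and still be running by the time nodeA starts. A minimal Python 3 sketch of how link entries imply a start order; `start_order` is a hypothetical helper written for illustration, not Blockade's actual implementation, and it assumes `links` is a plain list of container names as in the YAML above.

```python
def start_order(containers):
    """Order container names so every link target comes before its dependents.

    `containers` maps a container name to its config dict; only the
    optional "links" list is read (assumed to hold container names).
    """
    order = []
    visiting = set()
    done = set()

    def visit(name):
        if name in done:
            return
        if name in visiting:
            raise ValueError("circular link involving %r" % name)
        visiting.add(name)
        for target in containers[name].get("links", []):
            if target not in containers:
                raise KeyError("%r links to unknown container %r" % (name, target))
            visit(target)
        visiting.discard(name)
        done.add(name)
        order.append(name)

    # Sort the names so the result is deterministic.
    for name in sorted(containers):
        visit(name)
    return order


# Only the parts of the config that matter for ordering.
config = {
    "seedA": {},
    "nodeA": {"links": ["seedA"]},
}
print(start_order(config))  # ['seedA', 'nodeA']
```

The order only says which container is started first. If seedA exits right after starting, the link from nodeA still fails.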

Anything I can do here?

I'm running this in an Ubuntu 18.04 virtual machine on VMware Player 17.

$ uname -a
Linux ubuntu 5.4.0-146-generic #163~18.04.1-Ubuntu SMP Mon Mar 20 15:02:59 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

$ python --version
Python 2.7.17

Okay, this was my bad. My seed node was crashing, so there was no running seedA container for nodeA to link to.
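
For anyone who lands here with the same error: the Docker message names the container that is not running, and that is the one to inspect. A minimal Python 3 sketch of pulling those names out of the message; `parse_link_error` is a hypothetical helper, not part of Blockade or docker-py, and the pattern is based only on the message text in the tracebacks above.

```python
import re

# Matches the daemon message seen in the tracebacks above, e.g.
# "Cannot link to a non running container: /X AS /Y/alias"
LINK_ERROR = re.compile(
    r'Cannot link to a non running container: '
    r'/(?P<target>\S+) AS /(?P<source>[^/\s]+)/(?P<alias>[^\s")]+)'
)


def parse_link_error(message):
    """Return (target, source, alias) from a link error, or None if no match.

    target: the container that is not running
    source: the container that tried to link to it
    alias:  the link alias inside the source container
    """
    match = LINK_ERROR.search(message)
    if match is None:
        return None
    return match.group("target"), match.group("source"), match.group("alias")


message = (
    "Cannot link to a non running container: "
    "/simplecluster_seedA AS /simplecluster_nodeA/seedA"
)
print(parse_link_error(message))
# ('simplecluster_seedA', 'simplecluster_nodeA', 'seedA')
```

With the target name in hand, `docker ps -a` shows whether that container has exited, and `docker logs simplecluster_seedA` shows what it printed before it stopped.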