Roblox / nomad-driver-containerd

Nomad task driver for launching containers using containerd.


Running with Nomad inside containerd

the-maldridge opened this issue:

I'm interested in supporting this driver in ResinStack, a distribution I've developed to make the Nomad ecosystem more readily deployable. In this environment Nomad itself runs as a containerd task, and I'm trying to work out either what needs to be mounted in or whether I can change the mount paths. Right now I'm hung up on this error and would appreciate advice:

2022-01-27T22:10:45-06:00  Driver Failure  rpc error: code = Unknown desc = Error in creating container: failed to mount /tmp/containerd-mount2802059906: no such file or directory

/tmp from the host is available to the container, so I'm not really sure what's wrong here.

@the-maldridge Do you have a job spec?

Ask and ye shall receive:

job "proxy" {
  name = "proxy"
  datacenters = ["minicluster-control"]
  type = "system"

  group "traefik" {
    network {
      mode = "host"
      port "http" { static = 80 }
      port "metrics" { static = 8080 }

      dns {
        servers = ["127.0.0.1"]
      }
    }

    service {
      port = "http"
      check {
        type = "http"
        path = "/ping"
        port = "metrics"
        address_mode = "host"
        interval = "15s"
        timeout = "2s"
      }
      connect {
        native = true
      }
    }

    task "traefik" {
      driver = "containerd-driver"

      config {
        image = "traefik:v2.5.2"
 
        args = [
          "--accesslog=true",
          "--api.dashboard",
          "--api.insecure=true",
          "--entrypoints.http.address=:80",
          "--entrypoints.traefik.address=:8080",
          "--metrics.prometheus",
          "--pilot.dashboard=false",
          "--ping=true",
          "--providers.file.filename=/local/dynamic.toml",
          "--providers.consulcatalog.connectaware=true",
          "--providers.consulcatalog.connectbydefault=true",
          "--providers.consulcatalog.servicename=proxy-traefik",
          "--providers.consulcatalog.defaultrule=Host(`{{normalize .Name}}.mc`)",
          "--providers.consulcatalog.exposedbydefault=false",
          "--providers.consulcatalog.endpoint.address=127.0.0.1:8500",
        ]
      }

      template {
        data = <<EOF
[http]
  [http.routers]
    [http.routers.nomad]
      entryPoints = ["http"]
      service = "nomad"
      rule = "Host(`nomad.mc`)"
    [http.routers.consul]
      entryPoints = ["http"]
      service = "consul"
      rule = "Host(`consul.mc`)"
    [http.routers.vault]
      entryPoints = ["http"]
      service = "vault"
      rule = "Host(`vault.mc`)"
  [http.services]
    [http.services.nomad]
      [http.services.nomad.loadBalancer]
        [[http.services.nomad.loadBalancer.servers]]
          url = "http://nomad.service.consul:4646"
    [http.services.consul]
      [http.services.consul.loadBalancer]
        [[http.services.consul.loadBalancer.servers]]
          url = "http://consul.service.consul:8500"
    [http.services.vault]
      [http.services.vault.loadBalancer]
        [[http.services.vault.loadBalancer.servers]]
          url = "http://active.vault.service.consul:8200"
EOF
        destination = "local/dynamic.toml"
      }

      resources {
        cpu = 500
        memory = 64
      }
    }
  }
}

@the-maldridge Works fine for me!

root@vagrant:~/go/src/github.com/Roblox/nomad-driver-containerd/example# nomad status
ID     Type    Priority  Status   Submit Date
proxy  system  50        running  2022-01-28T06:37:10Z

Logs from Nomad

Jan 28 06:37:10 vagrant nomad[4654]:     2022-01-28T06:37:10.344Z [INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=bfaa8eaa-c9d9-13a1-34bd-4e246171ee89 task=traefik path=/tmp/nomad/alloc/bfaa8eaa-c9d9-13a1-34bd-4e246171ee89/alloc/logs/.traefik.stdout.fifo @module=logmon timestamp=2022-01-28T06:37:10.344Z
Jan 28 06:37:10 vagrant nomad[4654]:     2022-01-28T06:37:10.344Z [INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=bfaa8eaa-c9d9-13a1-34bd-4e246171ee89 task=traefik path=/tmp/nomad/alloc/bfaa8eaa-c9d9-13a1-34bd-4e246171ee89/alloc/logs/.traefik.stderr.fifo @module=logmon timestamp=2022-01-28T06:37:10.344Z
Jan 28 06:37:10 vagrant nomad[4654]:     2022/01/28 06:37:10.349050 [INFO] (runner) creating new runner (dry: false, once: false)
Jan 28 06:37:10 vagrant nomad[4654]:     2022/01/28 06:37:10.349589 [INFO] (runner) creating watcher
Jan 28 06:37:10 vagrant nomad[4654]:     2022/01/28 06:37:10.349908 [INFO] (runner) starting
Jan 28 06:37:10 vagrant nomad[4654]:     2022/01/28 06:37:10.351123 [INFO] (runner) rendered "(dynamic)" => "/tmp/nomad/alloc/bfaa8eaa-c9d9-13a1-34bd-4e246171ee89/traefik/local/dynamic.toml"
Jan 28 06:37:10 vagrant nomad[4654]:     2022-01-28T06:37:10.355Z [INFO]  client.driver_mgr.containerd-driver: starting task: driver=containerd-driver @module=containerd-driver driver_cfg="{Image:traefik:v2.5.2 Command: Args:[--accesslog=true --api.dashboard --api.insecure=true --entrypoints.http.address=:80 --entrypoints.traefik.address=:8080 --metrics.prometheus --pilot.dashboard=false --ping=true --providers.file.filename=/local/dynamic.toml --providers.consulcatalog.connectaware=true --providers.consulcatalog.connectbydefault=true --providers.consulcatalog.servicename=proxy-traefik --providers.consulcatalog.defaultrule=Host(`{{normalize .Name}}.mc`) --providers.consulcatalog.exposedbydefault=false --providers.consulcatalog.endpoint.address=127.0.0.1:8500] CapAdd:[] CapDrop:[] Cwd: Devices:[] Seccomp:false SeccompProfile: ShmSize: Sysctl:map[] Privileged:false PidsLimit:0 PidMode: Hostname: HostDNS:false ImagePullTimeout:5m ExtraHosts:[] Entrypoint:[] ReadOnlyRootfs:false HostNetwork:false Auth:{Username: Password:} Mounts:[{Type:bind Target:/etc/resolv.conf Source:/tmp/nomad/alloc/bfaa8eaa-c9d9-13a1-34bd-4e246171ee89/traefik/resolv.conf Options:[bind ro]}]}" timestamp=2022-01-28T06:37:10.354Z
Jan 28 06:37:35 vagrant nomad[4654]:     2022-01-28T06:37:35.052Z [INFO]  client.driver_mgr.containerd-driver: Successfully pulled docker.io/library/traefik:v2.5.2 image
Jan 28 06:37:35 vagrant nomad[4654]: : driver=containerd-driver @module=containerd-driver timestamp=2022-01-28T06:37:35.052Z
Jan 28 06:37:35 vagrant nomad[4654]:     2022-01-28T06:37:35.284Z [INFO]  client.driver_mgr.containerd-driver: Successfully created container with name: traefik-bfaa8eaa-c9d9-13a1-34bd-4e246171ee89
Jan 28 06:37:35 vagrant nomad[4654]: : driver=containerd-driver @module=containerd-driver timestamp=2022-01-28T06:37:35.284Z
Jan 28 06:37:35 vagrant nomad[4654]:     2022-01-28T06:37:35.524Z [INFO]  client.driver_mgr.containerd-driver: Successfully created task with ID: traefik-bfaa8eaa-c9d9-13a1-34bd-4e246171ee89
Jan 28 06:37:35 vagrant nomad[4654]: : driver=containerd-driver @module=containerd-driver timestamp=2022-01-28T06:37:35.523Z

Nomad alloc logs

root@vagrant:~/go/src/github.com/Roblox/nomad-driver-containerd/example# nomad alloc logs -f bfaa8eaa
time="2022-01-28T06:37:40Z" level=info msg="Configuration loaded from flags."

Yes, I expect it would on an un-namespaced system. The key point of my question, though, is that Nomad is itself running under containerd in an isolated mount namespace. I want to know what paths from the host I need to map for Nomad to be able to use the containerd driver.

@the-maldridge I am not sure if I completely follow your question. When you say un-namespaced system, are you talking about Nomad namespaces or Linux namespaces?

What do you mean by "nomad is itself running under containerd"? Are you trying to run Nomad-in-Nomad, like DinD (Docker-in-Docker)?
As in, you have a Nomad server that launches a container (c1) using containerd-driver, and you want to run Nomad inside that container, c1?

Nomad (s1) ---> containerd-driver ----> c1 [Nomad (s2)]

That's fair; this is a slightly unorthodox environment and I haven't explained it well. I am using linuxkit/linuxkit to build my machine images, which means the init and supervision system at the OS layer is containerd, and Nomad is itself a task started and managed by containerd, with filesystem isolation. What I want is to use the containerd-driver to have Nomad talk to the host containerd, in much the same way that binding the Docker socket into a container lets that container start additional Docker containers on the host.

So to recap, what I have is:

init-shim --> containerd --> nomad

And what I want to do is be able to do this:

init-shim --> containerd --> nomad
                         \-> my-nomad-alloc

To do this with a dockerd running adjacent to Nomad, I bind the following paths into Nomad's mount namespace:

    "/etc/nomad:/etc/nomad",
    "/etc/resolv.cluster:/etc/resolv.conf",
    "/lib/modules:/lib/modules",
    "/run:/run:rshared",
    "/service:/service",
    "/usr/bin/runsv:/usr/bin/runsv",
    "/var/persist:/var/persist:rshared",
    "/var/run:/var/run:rshared",

The important paths for the docker driver are /run, /lib/modules, and /var/persist (the Nomad data_dir). It looks like the containerd driver wants to use /tmp as well, and rather than playing whack-a-mole with paths I'm hoping there is a well-understood set of paths through which Nomad and containerd interact. My current, untested guess is below.
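Since the containerd socket at its stock location (/run/containerd/containerd.sock) is already covered by the /run bind above, my guess at the additions would be something like the following; both entries are assumptions on my part, /var/lib/containerd being the stock containerd state dir (snapshots/rootfs) and /tmp being where the failing mount path pointed:

    "/var/lib/containerd:/var/lib/containerd:rshared",
    "/tmp:/tmp:rshared",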

Hopefully that makes sense, but please don't hesitate to ask if there's more information I can provide.

@the-maldridge Why not run Nomad as a container in the host namespaces?

This way your Nomad (running as a containerd container) will have access to the host containerd (the init system) and can register the containerd-driver.
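Roughly, at the ctr level, I mean something like this (purely illustrative: the image name is a placeholder, and the exact wiring under linuxkit will differ):

    # Sketch: run Nomad with the host network and broad host binds, so the
    # host containerd socket under /run and the data_dir are visible inside.
    ctr run --net-host --privileged \
      --mount type=bind,src=/run,dst=/run,options=rbind \
      --mount type=bind,src=/var/persist,dst=/var/persist,options=rbind \
      docker.io/library/nomad:latest nomad

With the socket and the data_dir visible at the same paths on both sides, the paths the driver hands to containerd resolve in both namespaces.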

Hmm, my apologies; it seems I had not clearly communicated how this is configured.

Nomad is already running as a containerd container and has access to the containerd on the host. However, like all containerd containers, it has filesystem isolation by default, which means there are directories Nomad needs to share between its own namespace and the host's so that the host containerd can bind them in. Mostly this is the data directory containing all the alloc subdirectories, but the containerd driver also seems to want things in /tmp that other drivers do not.

I can crawl the code if the answer to "what directories does the containerd driver need to use" is "we don't know" but I'd hoped for an easy answer to this problem.

I don't think containerd-driver uses anything in /tmp. The only host location I know of that containerd-driver needs is /etc, since it sets up /etc/hosts and /etc/resolv.conf when creating the container. (You can see this in your logs above: the only mount the driver added was the bind of the alloc dir's resolv.conf onto /etc/resolv.conf.) There is nothing in /tmp that the driver needs to set up a container.

2022-01-27T22:10:45-06:00  Driver Failure  rpc error: code = Unknown desc = Error in creating container: failed to mount /tmp/containerd-mount2802059906: no such file or directory

The error you posted seems to be coming from containerd itself, when the driver calls containerd to set up the container.
I think we need to figure out where containerd is looking for that /tmp/containerd-mount2802059906, as it doesn't seem to be the host's /tmp. Most likely it's under the container rootfs, which is mounted somewhere on the host.

Most container supervisors (e.g. containerd) set up a block device (with a filesystem on top of it) that is mounted somewhere on the host; when your container process is started, the container's PID 1 is pivot_root'ed into that rootfs location instead of the host /. That's how they achieve filesystem isolation.
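For example, something along these lines from the host side (or from a shell sharing the host containerd's mount namespace) might show where that rootfs actually lives; a rough sketch, adjust for whatever tooling your image has:

    # List the rootfs/overlay mounts containerd has set up on the host
    mount | grep containerd
    findmnt -t overlay
    # Compare that with the view from inside Nomad's mount namespace
    nsenter -t "$(pidof nomad)" -m ls -l /tmp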

I would put a sleep in the driver where this error happens so that things don't get cleaned up right away, then look for that /tmp/containerd-mount2802059906 file and see what its actual host path is.
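Alternatively, without patching the driver, you could poll for the transient mount directory from the host while re-running the job (a sketch; the numeric suffix is random per attempt, hence the glob):

    # Catch containerd's short-lived temp mount before it gets cleaned up
    while ! ls -ld /tmp/containerd-mount* 2>/dev/null; do
      sleep 0.1
    done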