enhancement: systemd-nspawn to launch real Image= container

Question

rektide opened this issue 4 years ago · comments

using systemd-nspawn would let us run real containers, under systemd. from the man page's description:

systemd-nspawn may be used to run a command or OS in a light-weight namespace container. In many ways it is similar to chroot(1), but more powerful since it fully virtualizes the file system hierarchy, as well as the process tree, the various IPC subsystems and the host and domain name.

it would be neat to add a mode to systemk that uses systemd-nspawn to run full containers. Image= could then refer to real images. systemd-nspawn supports a variety of network modes, such as macvlan and ipvlan. there is a machinectl shell command, which would enable runInContainer, a systemk feature request tracked in #37. pretty specific/fancy stuff, but there are also things like the ability to clone a btrfs subvolume & launch that.
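a minimal sketch of how runInContainer could map onto machinectl shell, and how a clone could be requested. the helper names and the default user are hypothetical and nothing like this exists in systemk today; the helpers only build argument lists and do not execute anything.

```python
def machinectl_shell_argv(machine, command, user="root"):
    """Build argv for `machinectl shell USER@MACHINE PATH [ARGS...]`.

    `command` is the program path followed by its arguments.
    The default user "root" is an assumption made for this sketch.
    """
    if not command:
        raise ValueError("command must contain at least the program path")
    return ["machinectl", "shell", f"{user}@{machine}", *command]


def machinectl_clone_argv(source, target):
    """Build argv for `machinectl clone SOURCE TARGET`."""
    return ["machinectl", "clone", source, target]
```

for example, machinectl_shell_argv("web", ["/bin/ls", "-l"]) yields an argument list that could be handed to subprocess.run on a host where the "web" machine is running.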

running a systemd-nspawn container is a two-step process:

1. create a filesystem in /var/lib/machines/ with the expanded image. systemd's machinectl tool includes import-tar and import-fs helpers which could help load images. to fetch an image, we could use docker save, docker export, or something like what nspawn-oci does to load the image (using skopeo to fetch the image & oci-image-tools to expand it).
2. run the container. this can be done ephemerally, or we can create a unit.
a. use systemd-nspawn to run the container as a one-off, or
b. create a /etc/systemd/nspawn/my-service.nspawn unit file to configure the container, then run machinectl start to start it in a managed way.
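
the steps above can be sketched roughly as follows. this is only an illustration under stated assumptions: the function names, the tarball path, and the choice of settings in the .nspawn file are hypothetical, and the helpers just build command lines and file contents rather than touching the system.

```python
from pathlib import PurePosixPath

MACHINES_DIR = PurePosixPath("/var/lib/machines")
NSPAWN_DIR = PurePosixPath("/etc/systemd/nspawn")


def import_tar_argv(tarball, machine):
    """Step 1: `machinectl import-tar FILE NAME` expands a tarball into an image."""
    return ["machinectl", "import-tar", tarball, machine]


def run_once_argv(machine, command):
    """Step 2a: one-off run with systemd-nspawn against the expanded image."""
    root = MACHINES_DIR / machine
    return ["systemd-nspawn", f"--machine={machine}", f"--directory={root}", *command]


def nspawn_unit_path(machine):
    """Step 2b: where the per-machine settings file lives."""
    return str(NSPAWN_DIR / f"{machine}.nspawn")


def render_nspawn(boot=False, macvlan=None):
    """Step 2b: render a small .nspawn settings file.

    Only Boot= and MACVLAN= are emitted; which settings systemk
    would actually need is an open question.
    """
    lines = ["[Exec]", f"Boot={'yes' if boot else 'no'}"]
    if macvlan:
        lines += ["", "[Network]", f"MACVLAN={macvlan}"]
    return "\n".join(lines) + "\n"


def start_argv(machine):
    """Step 2b: start the machine in a managed way."""
    return ["machinectl", "start", machine]
```

the tarball for step 1 would come from something like docker export; writing the rendered file to nspawn_unit_path(...) and running the argument lists (for instance via subprocess.run) requires root on a systemd host.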

note that two years ago systemd seemed interested in becoming a runc compatible runtime, and if that happens my understanding is we could just run containerd directly against it, which might be a better idea than adapting systemk.

background: current systemk architecture

some notes i took, investigating how systemk works now

kubernetes pods are backed by systemd .service units created by the unit manager.

these systemd services use the local system to run. Image= refers to debian packages that are run on the system.

Pires · Answer 1 · Wed Jan 27 2021 21:29:36 GMT+0800 (China Standard Time)

TL;DR is that's interesting but then why not just run a kubelet + CRI-compatible container runtime, eg containerd?

Philosophical question aside, I do think the feature requested above is pretty doable. However, it would create a problem: systemk would provide a different UX for containerized vs. non-containerized workloads, e.g. RunInContainer, which is not possible (yet?) for the latter kind of workload.

Miek Gieben · Answer 2 · Thu Jan 28 2021 17:05:25 GMT+0800 (China Standard Time)

Can you still run rootless in the above scenario?