jizg / containers-from-scratch

Learning Go and containers by reimplementing https://github.com/lizrice/containers-from-scratch step by step

containers-from-scratch

Learning Go programming and Linux containerization by reimplementing https://github.com/lizrice/containers-from-scratch step by step.

Note: Every step is tagged in this repository. You can jump to the final state of any step with the following git command, replacing x with the step number:

git checkout -f stepx

Step1. Set up run function

git checkout -f step1

Add a run() function that prints every argument after the run subcommand, i.e. os.Args[2:].

$go run main.go run Hello, world 
Running [Hello, world]
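
A minimal sketch of what main.go might look like at this step. The helper name runMessage is hypothetical and exists only to make the formatting easy to check; the repository's own code may be organized differently.

```go
package main

import (
	"fmt"
	"os"
)

// runMessage formats the line that run prints for the given arguments.
func runMessage(args []string) string {
	return fmt.Sprintf("Running %v", args)
}

// run prints everything after the "run" subcommand, i.e. os.Args[2:].
func run() {
	fmt.Println(runMessage(os.Args[2:]))
}

func main() {
	if len(os.Args) < 2 || os.Args[1] != "run" {
		fmt.Println("usage: go run main.go run <args...>")
		return
	}
	run()
}
```

os.Args[0] is the program itself and os.Args[1] is the run subcommand, which is why the slice starts at index 2.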

Step2. Modify run function to execute the command in arguments

git checkout -f step2

Modify the run() function so that it executes the command given in the arguments.

$go run main.go run echo Hello, world 
Running [echo Hello, world]
Hello, world
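
One way to execute the command is with os/exec, wiring the child's standard streams to the current terminal so that interactive programs work. This is a sketch under that assumption; the helper name command is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// command builds the process to execute and attaches it to our terminal.
func command(args []string) *exec.Cmd {
	cmd := exec.Command(args[0], args[1:]...)
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

// run logs the arguments and then executes them as a command.
func run() {
	fmt.Printf("Running %v\n", os.Args[2:])
	if err := command(os.Args[2:]).Run(); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
	}
}

func main() {
	if len(os.Args) < 3 || os.Args[1] != "run" {
		fmt.Println("usage: go run main.go run <cmd> [args...]")
		return
	}
	run()
}
```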

Step3. Modify run function to enable UTS Namespace

git checkout -f step3

With a UTS namespace enabled, the hostname can be changed inside the new process without affecting the hostname seen by host processes.

root@ubuntu18:$go run main.go run /bin/bash
Running [/bin/bash]
root@ubuntu18:$hostname
ubuntu18
root@ubuntu18:$hostname container
root@ubuntu18:$hostname
container

Meanwhile, in the parent shell the hostname remains unchanged:

root@ubuntu18:$hostname
ubuntu18
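
In Go, a new UTS namespace can be requested through the Cloneflags field of syscall.SysProcAttr. The sketch below is Linux-only and needs root privileges to run; the helper name newUTSCommand is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// newUTSCommand prepares args to run in a fresh UTS namespace, so that
// hostname changes made by the command stay inside that namespace.
func newUTSCommand(args []string) *exec.Cmd {
	cmd := exec.Command(args[0], args[1:]...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUTS,
	}
	return cmd
}

func main() {
	if len(os.Args) < 3 || os.Args[1] != "run" {
		fmt.Println("usage: go run main.go run <cmd> [args...]")
		return
	}
	fmt.Printf("Running %v\n", os.Args[2:])
	if err := newUTSCommand(os.Args[2:]).Run(); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
	}
}
```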

Step4. Add child function in order to change hostname before entering bash

git checkout -f step4

To make bash display the updated hostname from the start, add a child function and have run execute it inside the new UTS namespace. The child sets the hostname first and then starts bash.

root@ubuntu18:$go run main.go run /bin/bash
Running [/bin/bash]
Running in child [/bin/bash]
root@container:$hostname
container

If you run ps in this containerized bash, you can still see all the parent processes of this bash process, and their PIDs are still the host OS PIDs, which means the process is not yet fully containerized.

root@container:$ps
  PID TTY          TIME CMD
27070 pts/10   00:00:00 sudo
27071 pts/10   00:00:00 bash
27823 pts/10   00:00:00 go
27844 pts/10   00:00:00 main
27848 pts/10   00:00:00 exe
27852 pts/10   00:00:00 bash
28292 pts/10   00:00:00 ps
root@container:$ps fax
...
27070 pts/10   S      0:00      |   |   \_ sudo /bin/bash
27071 pts/10   S      0:00      |   |       \_ /bin/bash
27823 pts/10   SLl    0:00      |   |           \_ go run main.go run /bin/bash
27844 pts/10   SLl    0:00      |   |               \_ /tmp/go-build645762770/b0
27848 pts/10   SLl    0:00      |   |                   \_ /proc/self/exe child 
27852 pts/10   S      0:00      |   |                       \_ /bin/bash
28779 pts/10   R+     0:00      |   |                           \_ ps fax
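
The run/child split for this step can be sketched as follows. The /proc/self/exe child re-exec matches the process tree shown above; the helper names childArgs and attach are hypothetical. It is Linux-only and needs root privileges.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// childArgs turns "run <cmd> ..." into "child <cmd> ..." for the re-exec.
func childArgs(args []string) []string {
	return append([]string{"child"}, args...)
}

// attach wires a command to the current terminal.
func attach(cmd *exec.Cmd) *exec.Cmd {
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	return cmd
}

// run re-executes this binary as "child" inside a new UTS namespace.
func run() error {
	fmt.Printf("Running %v\n", os.Args[2:])
	cmd := attach(exec.Command("/proc/self/exe", childArgs(os.Args[2:])...))
	cmd.SysProcAttr = &syscall.SysProcAttr{Cloneflags: syscall.CLONE_NEWUTS}
	return cmd.Run()
}

// child already runs in the new namespace, so the hostname is changed
// there before the requested command (e.g. bash) starts.
func child() error {
	fmt.Printf("Running in child %v\n", os.Args[2:])
	if err := syscall.Sethostname([]byte("container")); err != nil {
		return err
	}
	return attach(exec.Command(os.Args[2], os.Args[3:]...)).Run()
}

func main() {
	if len(os.Args) < 3 {
		fmt.Println("usage: go run main.go run <cmd> [args...]")
		return
	}
	var err error
	switch os.Args[1] {
	case "run":
		err = run()
	case "child":
		err = child()
	default:
		err = fmt.Errorf("unknown command %q", os.Args[1])
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
	}
}
```

The re-exec is needed because the namespace only exists once the new process has been created, so the hostname has to be set from inside that process.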

Step5. Modify run function to enable PID Namespace

git checkout -f step5

In the run function, enable a PID namespace so that PIDs inside the containerized bash start from 1.

root@ubuntu18:$go run main.go run /bin/bash
Running [/bin/bash] as 30582
Running in child [/bin/bash] as 1
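
A sketch of only the pieces that change in this step: the extra clone flag and the PID in the log line. The names containerFlags and banner are hypothetical, and the program merely prints the values rather than starting a container.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// containerFlags lists the namespaces requested for the child process:
// the UTS namespace from the earlier step plus a new PID namespace.
const containerFlags = syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID

// banner formats the log line, including the caller's PID.
func banner(prefix string, args []string, pid int) string {
	return fmt.Sprintf("%s %v as %d", prefix, args, pid)
}

func main() {
	attr := &syscall.SysProcAttr{Cloneflags: containerFlags}
	fmt.Println(banner("Running", os.Args[1:], os.Getpid()))
	fmt.Printf("clone flags: %#x\n", attr.Cloneflags)
}
```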

But if you run ps again, you still see all the parent processes of this bash process, with their host PIDs. This is because ps reads process information from /proc, which is not isolated by the UTS and PID namespaces and is therefore still shared between the host and the containerized bash.

root@container:$ps
  PID TTY          TIME CMD
30191 pts/10   00:00:00 sudo
30192 pts/10   00:00:00 bash
30561 pts/10   00:00:00 go
30582 pts/10   00:00:00 main
30586 pts/10   00:00:00 exe
30590 pts/10   00:00:00 bash
30818 pts/10   00:00:00 ps
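
To illustrate where that output comes from, the following sketch lists processes the same way, by scanning /proc for numeric directory names and reading each process's comm file. It is a simplified stand-in for ps, not its actual implementation, and isPID is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
)

// isPID reports whether a /proc entry name consists only of digits,
// which is how per-process directories are named.
func isPID(name string) bool {
	if name == "" {
		return false
	}
	for _, r := range name {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

func main() {
	entries, err := os.ReadDir("/proc")
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	for _, e := range entries {
		if !e.IsDir() || !isPID(e.Name()) {
			continue
		}
		comm, err := os.ReadFile("/proc/" + e.Name() + "/comm")
		if err != nil {
			continue // the process may have exited in the meantime
		}
		fmt.Printf("%s %s", e.Name(), comm)
	}
}
```

Whatever is mounted at /proc determines what such a scan sees, which is why the PID namespace alone does not change the ps output.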

Step6. Set containerized process root directory to a new path

git checkout -f step6

To stop the containerized process from sharing /proc with the host OS, use the chroot and chdir syscalls to give it a new root directory. First, prepare a clean Ubuntu filesystem with the following commands.

$ CID=$(docker create ubuntu)
$ ROOTFS=~/ubuntufs
$ mkdir -p "$ROOTFS"
$ docker export "$CID" | tar -xf - -C "$ROOTFS"

In the child function, call the chroot and chdir syscalls to set the root directory to the ubuntufs folder. You can check the new root of the containerized process with the following commands.

In the containerized bash, ps now fails because there is still nothing under /proc:

root@container:$sleep 100
root@container:$ps
Error, do this: mount -t proc proc /proc

In the host OS bash:

root@ubuntu18:$ps -C sleep
PID TTY          TIME CMD
 5697 pts/10   00:00:00 sleep
root@ubuntu18:$ls -l /proc/5697/root
lrwxrwxrwx 1 root root 0 Apr 28 23:36 /proc/5697/root -> /home/jizg/ubuntufs

So the new root directory is the extracted filesystem of the ubuntu:latest image from Docker Hub.
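
The chroot and chdir calls described in this step can be sketched as below. The rootfs path is taken from the transcript above and will differ on other machines; enterRoot is a hypothetical helper name. It requires root privileges.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// rootfs is the directory prepared with docker export (path from the
// transcript above; adjust it for your own machine).
const rootfs = "/home/jizg/ubuntufs"

// enterRoot makes dir the root of the calling process and then moves
// the working directory inside the new root.
func enterRoot(dir string) error {
	if err := syscall.Chroot(dir); err != nil {
		return fmt.Errorf("chroot %s: %w", dir, err)
	}
	if err := os.Chdir("/"); err != nil {
		return fmt.Errorf("chdir /: %w", err)
	}
	return nil
}

func main() {
	if err := enterRoot(rootfs); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	fmt.Println("root directory changed to", rootfs)
}
```

The chdir after chroot matters: without it the working directory would still point outside the new root.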

Step7. Enable Mount Namespace in run function, and mount /proc to containerized process in child function

git checkout -f step7

After mounting /proc in the new root filesystem, ps displays only the processes inside the container.

root@container:$ps
  PID TTY          TIME CMD
    1 ?        00:00:00 exe
    5 ?        00:00:00 bash
    7 ?        00:00:00 ps

In the host OS:

root@ubuntu18:$mount | grep proc
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=25,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=13556)
proc on /home/jizg/ubuntufs/proc type proc (rw,relatime)

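The last line of the output above shows that, without mount isolation, the container's /proc mount is visible from the host. After adding unshare flags for the mount namespace, mounts made inside the container are no longer exposed to the host OS, so mount | grep proc on the host no longer lists the /proc mounted by the containerized process.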
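
A sketch of the two pieces involved in this step: the namespace flags set in run, and the proc mount performed in child after chroot. The helper names containerAttr and withProc are hypothetical; withProc needs root privileges and is therefore not called from main here.

```go
package main

import (
	"fmt"
	"syscall"
)

// containerAttr requests new UTS, PID and mount namespaces. Unshareflags
// additionally unshares the mount namespace in the child, so that mounts
// made there are not exposed to the host.
func containerAttr() *syscall.SysProcAttr {
	return &syscall.SysProcAttr{
		Cloneflags:   syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
		Unshareflags: syscall.CLONE_NEWNS,
	}
}

// withProc mounts procfs at /proc, runs fn, then unmounts it again.
// It is meant to be called after chroot, inside the new namespaces.
func withProc(fn func() error) error {
	if err := syscall.Mount("proc", "/proc", "proc", 0, ""); err != nil {
		return fmt.Errorf("mount proc: %w", err)
	}
	defer syscall.Unmount("/proc", 0)
	return fn()
}

func main() {
	attr := containerAttr()
	fmt.Printf("clone flags %#x, unshare flags %#x\n", attr.Cloneflags, attr.Unshareflags)
}
```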

Step8. Add a pids cgroup (process number controller) to the child function to limit the containerized environment to 20 processes

git checkout -f step8

Add a new function cg that configures the pids cgroup and sets the maximum number of processes to 20, then call cg from child.

root@container:$ps
  PID TTY          TIME CMD
    1 ?        00:00:00 exe
    5 ?        00:00:00 bash
    7 ?        00:00:00 ps
root@container:$sleep 100

In the host OS:

root@ubuntu18:$ps -C sleep
PID TTY          TIME CMD
30048 pts/10   00:00:00 sleep
root@ubuntu18:$cd /sys/fs/cgroup/pids/jizg
root@ubuntu18:$cat cgroup.procs
30038
30045
30048

You can run :() { : | : & }; : (a fork bomb) in the containerized bash to fork new processes endlessly; forking eventually fails once the total number of processes reaches 20.
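
A sketch of what the cg function might do, assuming a cgroup v1 hierarchy with the pids controller mounted at the path inspected in the transcript above. The setting type and the settings helper are hypothetical; writing to the cgroup filesystem requires root privileges.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// cgroupDir matches the directory shown in the transcript above
// (assumes cgroup v1 with the pids controller mounted there).
const cgroupDir = "/sys/fs/cgroup/pids/jizg"

// setting is one control file and the value to write into it.
type setting struct{ file, value string }

// settings returns the writes to perform, in order: set the process
// limit first, then move the given process into the group.
func settings(maxProcs, pid int) []setting {
	return []setting{
		{"pids.max", strconv.Itoa(maxProcs)},
		{"cgroup.procs", strconv.Itoa(pid)},
	}
}

// cg creates the cgroup directory if needed and applies the settings.
func cg(dir string, maxProcs, pid int) error {
	if err := os.Mkdir(dir, 0755); err != nil && !os.IsExist(err) {
		return err
	}
	for _, s := range settings(maxProcs, pid) {
		path := filepath.Join(dir, s.file)
		if err := os.WriteFile(path, []byte(s.value), 0700); err != nil {
			return fmt.Errorf("write %s: %w", s.file, err)
		}
	}
	return nil
}

func main() {
	if err := cg(cgroupDir, 20, os.Getpid()); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
	}
}
```

Processes forked after the write to cgroup.procs stay in the same cgroup, so the limit covers everything started from the containerized bash.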

License:Apache License 2.0

