skiffos / SkiffOS

Any Linux distribution, anywhere.

Home Page: https://skiffos.com

WSL: timed out waiting for init process to start

clayauld opened this issue

After building SkiffOS for WSL and importing the tar.gz archive according to the guide, the system will not start. The error is shown in the screenshot.

I have confirmed that I imported the archive as WSL version 2.

image

I should add that this is built from the latest release version.

WSL has evolved enough now that the skiff config will need to be changed a bit. Probably there's a better way now to start systemd than the workaround used here.

I'm going to have to take a deeper look at this and see the best way to address this issue.

Any updates here? Wondering if there's anything I can help with here. I'd like to get a version of this in WSL for testing purposes.

Hey @clayauld I'll boot up my windows machine and try to find a fix this evening.

Let me know what I can do to help!

I'm not sure about today, but in the past it was not possible to set a process as the "init process" in WSL2, so the workaround in skiff is to start systemd as a side effect of the first command-line invocation (wsl.exe).

I'm still looking into a solution to this, but I managed to get the Windows machine fully updated. I'll continue looking tomorrow and prioritize it this week.

Interesting digging into this.... technically WSL doesn't officially support systemd and most WSL distros don't use it. Here's a screenshot from my favorite, Pengwin.

image

However, there is this: https://github.com/arkane-systems/genie

This makes me think that full compatibility with WSL would require reworking SkiffOS for WSL to remove systemd as a dependency and start the services individually. But......that seems like a LOT of work and likely to break many things.

No, it doesn't need to be reworked. The current setup worked in the past to start systemd the first time WSL starts. I just need to adjust the script slightly for recent changes in WSL.

According to https://docs.microsoft.com/en-us/windows/wsl/wsl-config#boot-settings

In /etc/wsl.conf you can specify a [boot] command with an initial process to run on startup.

In this case we use /boot/skiff-init/skiff-init-squashfs
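
For illustration, a minimal sketch of what such a `[boot]` entry could look like, and how a script might read it back. The `[boot]` section and its `command` key come from the Microsoft documentation linked above; the sample text and the helper name `boot_command` are assumptions made for this example and are not taken from the SkiffOS tree.

```python
import configparser

# Sample /etc/wsl.conf content (illustrative). The command path is the one
# mentioned in this thread.
SAMPLE_WSL_CONF = """\
[boot]
command = /boot/skiff-init/skiff-init-squashfs
"""


def boot_command(conf_text):
    """Return the [boot] command from wsl.conf text, or None if it is unset.

    Hypothetical helper, not part of SkiffOS or WSL.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # fallback=None covers both a missing [boot] section and a missing key.
    return parser.get("boot", "command", fallback=None)


if __name__ == "__main__":
    print(boot_command(SAMPLE_WSL_CONF))
```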

The startup process is:

  1. wsl.exe
  2. WSL starts /bin/bash which is actually wsl-shell (C code in this repo)
  3. wsl-shell waits for a pid file to exist
  4. skiff-init-squashfs mounts the squashfs & overlayfs & chroots into it
  5. skiff-init-squashfs starts /wsl-init.sh inside the chroot
  6. wsl-init.sh sets up some mounts & starts systemd
  7. skiff-init-squashfs writes the systemd pid to the file
  8. wsl-shell sees the pid file & enters the namespaces of systemd
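
The waiting side of this handshake (steps 3 and 8) can be sketched roughly as follows. This is not the actual wsl-shell code, which is C; it is a Python illustration under stated assumptions: the function names, the timeout values, and the choice of the `mnt` and `pid` namespaces are all hypothetical. It does show where a "timed out waiting for init process" condition would come from: the pid file never appears before the deadline.

```python
import time


def wait_for_pid_file(path, timeout=10.0, poll_interval=0.1):
    """Poll until `path` contains a PID; return it, or None on timeout.

    Hypothetical helper mirroring step 3 above.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(path) as f:
                text = f.read().strip()
            if text:
                return int(text)
        except (FileNotFoundError, ValueError):
            # File not created yet, or only partially written: keep polling.
            pass
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)


def namespace_paths(pid, kinds=("mnt", "pid")):
    """Return the /proc namespace handles a shell would open to join `pid`.

    Which namespaces wsl-shell really enters is an assumption here.
    """
    return [f"/proc/{pid}/ns/{kind}" for kind in kinds]
```

A caller would treat a `None` result as the timeout case and otherwise open each path from `namespace_paths` and join it (for example with `setns(2)` in C) before running the user's shell.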

So it's a bit complicated but it worked reliably previously. I suspect that some changes made to skiff-init-squashfs broke it somewhere along the way.

To debug this, I'll undo the symlink to wsl-shell to have it instead run stock bash, then run /boot/skiff-init/skiff-init-squashfs manually and look at the logs to see why it is failing.

Will try to do this today.

So I see here that it says: "The Boot setting is only available on Windows 11."

One possible workaround is to check whether the boot command ran and, if it did not, start init using wsl-shell on first run.
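
A rough sketch of that check, purely as an illustration. It assumes that the presence of the pid file is a usable signal that the boot command already ran, which may not match what the actual fix does; `ensure_init_started` and the `start_init` callback are hypothetical names.

```python
import os


def ensure_init_started(pid_file, start_init):
    """Start init through `start_init()` only if no pid file exists yet.

    Returns "boot" when init was already started (e.g. by the [boot]
    command on Windows 11) and "fallback" when this call had to start it.
    """
    if os.path.exists(pid_file):
        return "boot"
    start_init()  # assumed to launch skiff-init-squashfs in the background
    return "fallback"
```

Passing the launcher in as a callback keeps the decision logic separate from how the init process is actually spawned.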

#224 has that workaround; it needs more testing.

I've been swamped with other things but I'll test #224 when I can.