phil-opp / blog_os

Writing an OS in Rust

Home Page: http://os.phil-opp.com

Tracking Issue for Second Edition

phil-opp opened this issue · comments

Create a second edition of the blog, which reorders the posts (exceptions before page tables) and uses its own bootloader. This should make the posts easier to follow.


Edit: This issue will serve as a tracking issue. The “Second Edition” milestone contains all associated issues and pull requests.

Planned high-level changes

  • Create a custom, dependency-free bootloader
  • Work out a new order of posts
    • Remove the sudden jump in difficulty with the “Page Tables” post
    • My current plan: Setup & printing first, then exceptions and hardware interrupts (including keyboard and timer support, and maybe even multithreading), then memory management
  • Move more code into the x86_64 crate
    • Remove low-level implementation details (e.g. how to align a memory address) from the blog text, but …
    • Document the code in the x86_64 crate very well and encourage readers to look at it
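
As an illustration of the kind of low-level detail mentioned above (aligning a memory address), here is a minimal sketch. The function names `align_down` and `align_up` are chosen for this example and are not claimed to match the x86_64 crate's API; the sketch assumes the alignment is a power of two.

```rust
/// Rounds `addr` down to the nearest multiple of `align`.
/// Assumption of this sketch: `align` is a power of two.
fn align_down(addr: u64, align: u64) -> u64 {
    assert!(align.is_power_of_two(), "`align` must be a power of two");
    // Clearing the low bits rounds down to the alignment boundary.
    addr & !(align - 1)
}

/// Rounds `addr` up to the nearest multiple of `align`.
/// Assumption of this sketch: `align` is a power of two and the
/// rounded-up result still fits into a `u64`.
fn align_up(addr: u64, align: u64) -> u64 {
    assert!(align.is_power_of_two(), "`align` must be a power of two");
    let mask = align - 1;
    if addr & mask == 0 {
        addr // already aligned
    } else {
        // Set all low bits, then add one to reach the next boundary.
        (addr | mask) + 1
    }
}
```

For a 4 KiB page size, `align_down(0x1234, 0x1000)` gives `0x1000` and `align_up(0x1234, 0x1000)` gives `0x2000`.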

Further changes

  • Link using LLD (#370)
  • “Kernel Heap”: include more allocator types
  • Put the kernel in the higher half of the address space (see #360 (comment)), maybe sometime in the future when we introduce linker scripts
  • Unit tests and other tests? (see robert-w-gries/rxinu#32 for an overview of possibilities)
  • Be platform independent: The blog should work on Linux, Mac, and Windows
    • There are CI jobs that verify this.

Other planned things

  • Use rustfmt-nightly so that the code always has standard format. Check formatting via travis.
  • Docker support (#369)
  • Make sure that every code snippet has a comment that specifies the file path (#382).

Maybe


(This list is far from complete and will be extended over time.)

Do you have any plans to re-implement the kernel code in 100% Rust? I noticed that redox manages to use only Rust for kernel code and thought that might be a good goal for the second edition.

Note that there are a dozen calls to the asm! macro in their kernel.

And thank you for the work you've put into the blog! It's been so helpful for both getting me started with a Rust kernel and teaching OSDev in general!

Thanks, glad to hear!

Do you have any plans to re-implement the kernel code in 100% Rust?

Yes, I even plan to write the bootloader in inline assembly, so that you don't need any other tools to build a bootable disk image (CD images are harder, but maybe we will support that too). The only non-Rust tools we will need are a linker (though I hope that the LLVM linker lld is shipped with the Rust compiler soon) and binutils (stuff like objdump or gdb; we only need it for debugging).

This would be super awesome, I literally can't wait for it.

Do you have any plans to put the kernel in the higher half? It's not too hard to do (albeit very easy to get wrong), it's something that isn't terribly well-documented for 64-bit, and it keeps you honest about physical and virtual addresses.

I also think a higher-half design would teach a lot more. I've come across a few bugs in the tutorial code from where physical and virtual addresses got sort of mixed up (due to the identity mapping, these weren't picked up on I'm guessing) and it would set us up nicely for user-land programming!
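
To illustrate how a higher-half design can make mixed-up physical and virtual addresses visible, here is a rough sketch using separate wrapper types. Everything in it is an assumption for illustration: the `KERNEL_OFFSET` value, the type names, and the idea that all physical memory is mapped at one fixed offset are not taken from the blog's code.

```rust
/// Hypothetical base of the higher-half mapping (illustration only).
const KERNEL_OFFSET: u64 = 0xFFFF_8000_0000_0000;

/// A physical address. A distinct type, so it cannot be passed where a
/// virtual address is expected.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PhysAddr(u64);

/// A virtual address.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct VirtAddr(u64);

/// Translates a physical address into the higher-half virtual address at
/// which this sketch assumes it is mapped. Returns `None` on overflow.
fn phys_to_virt(addr: PhysAddr) -> Option<VirtAddr> {
    addr.0.checked_add(KERNEL_OFFSET).map(VirtAddr)
}

/// Translates a higher-half virtual address back to a physical address.
/// Returns `None` for addresses below `KERNEL_OFFSET`.
fn virt_to_phys(addr: VirtAddr) -> Option<PhysAddr> {
    addr.0.checked_sub(KERNEL_OFFSET).map(PhysAddr)
}
```

With an identity mapping, both conversions would be no-ops and a swapped address would go unnoticed. Here, passing a low address to `virt_to_phys` yields `None`, and handing a `PhysAddr` to code that expects a `VirtAddr` does not compile.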

That's a good idea, thanks for the suggestion. I added it to the list.

I looked into #368 more and determined that it will be a long time before it can be implemented for blog_os. It should be removed from the Second Edition milestone.

The issue is simply that 64-bit x86 javascript emulators do not exist yet. The most mature js emulator I could find is v86, which only supports 32 bit. The maintainer has no immediate plans to implement 64 bit, but the maintainer is working on a WebAssembly port that will make it possible to support 64 bit mode later.

I still believe that a javascript emulator has value, but it's clear that it won't be relevant to this project for a while. Maybe I'll write one in Rust so we can implement it for blog_os's Tenth Edition 😉

One other thing: the blog currently recommends --gc-sections to deal with missing symbols, but I feel like that's kicking the can down the road rather than solving the problem. The solution ought to be to include compiler-builtins, but for some reason it's missing __floatdisf and __floatundisf.

It appears to be simple enough to add them (one macro invocation each) but there may be a good reason for not doing that which I don't know because I'm way out of my depth; in the event it's just that simple, I have a patch ready to go.

@robert-w-gries Thanks for the update! The lack of 64-bit support is a pity. So I guess we have to wait for the web assembly port.

@ketsuban I think you're right that including compiler_builtins is the better solution, I added it to the list. I don't know about the __floatdisf symbols, so let's see how they react to your issue. It seems like it isn't getting much attention, so maybe ping someone or ask on IRC after the holidays.

I made some progress on the inline assembly bootloader: https://github.com/phil-opp/rustboot-x86. It is still in its early stages, but it already switches to long mode and loads and parses an ELF binary. The next step is to generate page tables to map the kernel to its desired addresses (you will be able to put the kernel at any address you like, including the higher half). After that, we can set up a stack, switch page tables, and jump to the kernel entry point.

Unfortunately, we are not done after that. Our kernel needs some information from the BIOS (most importantly a physical memory map), and this information can only be retrieved from the BIOS in real mode. So we need to create some kind of boot information structure, similar to what GRUB does.

Are you planning on also writing blog posts on how to write the boot loader from scratch, or just have readers use that crate? Personally I would be interested in reading and implementing it myself, but I could imagine some readers wouldn’t be very interested, since it is a lot of assembly code.

I plan to handle it like the exceptions posts: a mainline post that just uses the crate and an optional post that explains how it works behind the scenes.

More bootloader progress:

I created a tool named bootimage that does the following:

  1. runs xargo build in the current directory, which should be the root directory of a freestanding executable crate
  2. downloads and compiles the pure-Rust bootloader in the target directory
  3. combines the bootloader with the compiled ELF file and outputs a bootable disk image

It is used like this:

> bootimage -o "boot.bin" --target your-target-file
   Compiling rlibc v1.0.0
   Compiling blog_os_rewrite v0.1.0 (file:///…/blog_os_rewrite)
    Finished dev [unoptimized + debuginfo] target(s) in 1.5 secs
Cloning bootloader from https://github.com/phil-opp/rustboot-x86
Compiling bootloader...
   Compiling usize_conversions v0.2.0
   Compiling zero v0.1.2
   Compiling bit_field v0.9.0
   Compiling bitflags v1.0.1
   Compiling ux v0.1.0
   Compiling rlibc v1.0.0
   Compiling xmas-elf v0.6.1
   Compiling x86_64 v0.2.0-alpha-001
   Compiling elf_loader v0.1.0 (file:///…/blog_os_rewrite/target/your-target-file/debug/bootloader)
    Finished release [optimized] target(s) in 22.84 secs
> qemu-system-x86_64 -hda boot.bin

image

The "Hello World!" output comes from the kernel, the other output from the bootloader.
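
The final step of the tool (combining the bootloader with the compiled kernel) can be pictured roughly as follows. This is only a sketch under assumptions: the function name `build_disk_image` is made up for this example, and padding the result to a 512-byte sector boundary is an assumption rather than something the bootimage tool is known to do.

```rust
/// Assumed sector size of the target disk (illustration only).
const SECTOR_SIZE: usize = 512;

/// Appends the kernel bytes to the bootloader bytes and zero-pads the
/// result so that its length is a multiple of `SECTOR_SIZE`.
fn build_disk_image(bootloader: &[u8], kernel: &[u8]) -> Vec<u8> {
    let mut image = Vec::with_capacity(bootloader.len() + kernel.len() + SECTOR_SIZE);
    image.extend_from_slice(bootloader);
    image.extend_from_slice(kernel);

    // Pad with zeros up to the next sector boundary, if needed.
    let remainder = image.len() % SECTOR_SIZE;
    if remainder != 0 {
        let padded_len = image.len() + (SECTOR_SIZE - remainder);
        image.resize(padded_len, 0);
    }
    image
}
```

In a real tool the two inputs would come from the files produced by the build, and the returned bytes would be written to the output file (such as `boot.bin` in the example above).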


The next step is to create a physical memory map and pass it to the kernel. Then I plan to write the corresponding blog posts (first the mainline post that just uses the crate, and then the posts that explain how it works behind the scenes).

Time for a new status update:

  • The bootloader creates and passes a memory map to the kernel now. It even marks all physical frames it uses for e.g. the kernel or page tables as in use in that memory map. So the kernel can freely use any unused areas.
  • I changed the default linker to LLD. So the user has to install it, but there's no need to cross compile anything since LLD is a cross linker by default. Let's hope that LLD is distributed with rustc soon, then it would work out of the box.
  • Unfortunately, the bootloader does not link with LLD. It throws an unknown relocation type: R_X86_64_PC16 error. It's probably because the bootloader combines 64-bit and 16-bit code and LLD has not implemented that yet… To work around this issue, I decided to set up an automated travis deploy job that uploads the bootloader binary to GitHub releases. Instead of compiling it, the bootimage crate now just downloads and appends the binary bootloader blob.
  • The effort was not wasted: I was able to create CI jobs that successfully build the disk image of a hello world kernel on Linux, Windows, and Mac (via travis and appveyor).
  • I started writing the first two blog posts of the second edition. I hope to be able to publish them in "beta mode" (i.e. not linked from the front page) next week.
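
To make the memory map item above more concrete, here is a minimal sketch of how marking frames as in use could work. The `MemoryRegion` and `RegionKind` types and the `mark_in_use` function are hypothetical and are not the bootloader's actual data structures.

```rust
/// Whether a region is free for the kernel or already occupied
/// (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RegionKind {
    Usable,
    InUse,
}

/// A physical memory region covering the half-open range `[start, end)`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct MemoryRegion {
    start: u64,
    end: u64,
    kind: RegionKind,
}

/// Returns a new map in which `[start, end)` is marked as in use.
/// Usable regions that overlap the range are split as needed.
fn mark_in_use(map: &[MemoryRegion], start: u64, end: u64) -> Vec<MemoryRegion> {
    if start >= end {
        return map.to_vec(); // empty range: nothing to mark
    }
    let mut out = Vec::new();
    for region in map {
        let overlaps = start < region.end && end > region.start;
        if region.kind != RegionKind::Usable || !overlaps {
            out.push(*region); // unaffected regions are copied unchanged
            continue;
        }
        let cut_start = start.max(region.start);
        let cut_end = end.min(region.end);
        if region.start < cut_start {
            out.push(MemoryRegion { start: region.start, end: cut_start, kind: RegionKind::Usable });
        }
        out.push(MemoryRegion { start: cut_start, end: cut_end, kind: RegionKind::InUse });
        if cut_end < region.end {
            out.push(MemoryRegion { start: cut_end, end: region.end, kind: RegionKind::Usable });
        }
    }
    out
}
```

A kernel that receives such a map can then hand out only the regions still marked as usable.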

The first two posts of the second edition landed yesterday in #385! They are published in beta mode (not linked from the front page) at https://os.phil-opp.com/second-edition now. Please take a look and tell me what you think :).

Oh, and LLD is very close to being shipped with rustc: rust-lang/rust#48125. As soon as it's in nightly, I plan to add a link to the second edition from the front page and post it to reddit etc.

Hi Phil,

I just read over the first article of the second edition, and wrote down nitpicks that I noticed while reading. I wasn't sure how to properly review since stuff was already checked in, so I created a pull request at #389.

I didn't try the code though, just read the article. Great work!

Cheers,
Andre

@donald-pinckney Thanks, looking forward to your feedback!

@andre-richter Thanks a lot!

Just went through the first page and code, and it worked great (I'm on macOS 10.13). The only writing nitpick that I found after @andre-richter's merge was:

I recommend you to read the Linux section even if you're on a different OS because it is the target we will derive to build our kernel in the next post.

Maybe it isn't a typo, but it didn't make sense: perhaps derive has too much overloaded meaning for me via #[derive()] ;)

The second page and code also went swimmingly for me, except for one problem, #391 (I need panic = "abort" in Cargo.toml and "panic": "abort" in the .json target file, which is wrong according to the page).

Overall this so far seems much cleaner and more enjoyable than all the GRUB stuff from before, and the bootimage abstraction is great (though I look forward to reading about how it works). One comment is that I would recommend explaining a bit more deeply what exactly bootimage does. Currently I feel like it:

  1. Builds my project with xargo.
  2. Does magic to make it bootable.

I suppose that it downloads the bootloader, and then somehow combines it with the built executable from xargo to automatically make an image that includes both the bootloader and the executable, but how does the bootloader find the entry point of our executable? I know the point is not to get distracted with too much bootloader stuff in this post, and keeping bootimage a bit magic is fine. It would be great to peek just a bit inside the magic though.

Great work on all of this so far!

@donald-pinckney Thanks a lot for the feedback!

Overall this so far seems much cleaner and more enjoyable than all the GRUB stuff from before

Awesome! Glad to hear that!

One comment is that I would recommend explaining a bit more deeply what exactly bootimage does.

Very good point. I definitely don't want the posts to become too magical!

how does the bootloader find the entry point of our executable?

It first loads the kernel ELF file into memory and parses it. It then maps each segment to the specified virtual address in a new page table (including the correct read/write/execute bits). The entry point address is stored in the ELF file too, so the bootloader can just jump to it (after doing an address space switch to the new page table).

I'll try to add a similar high-level explanation to the post.
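
For readers who want to see where the entry point lives, here is a small sketch that reads it straight from the header of a little-endian 64-bit ELF file. It is not the bootloader's code (the build log above suggests it uses the xmas-elf crate for parsing); the function name `elf64_entry_point` is made up for this example, and only the header fields needed here are checked.

```rust
/// Reads the entry point address (`e_entry`) from the header of a
/// little-endian ELF64 image. Returns `None` if the data is too short
/// or is not a little-endian 64-bit ELF file.
fn elf64_entry_point(image: &[u8]) -> Option<u64> {
    const ELF_MAGIC: [u8; 4] = [0x7f, b'E', b'L', b'F'];
    const ELFCLASS64: u8 = 2; // e_ident[EI_CLASS] value for 64-bit objects
    const ELFDATA2LSB: u8 = 1; // e_ident[EI_DATA] value for little-endian

    // In ELF64, `e_entry` occupies bytes 24..32 of the file header.
    let header = image.get(..32)?;
    if !header.starts_with(&ELF_MAGIC) || header[4] != ELFCLASS64 || header[5] != ELFDATA2LSB {
        return None;
    }

    let mut entry = [0u8; 8];
    entry.copy_from_slice(&header[24..32]);
    Some(u64::from_le_bytes(entry))
}
```

A bootloader would still need to map the segments and switch page tables, as described above, before jumping to the returned address.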

Is there a clean way that a reader of the series can start from the second edition and transition into implementing features from the first edition of the blog?

It seems like that could be an effective holdover until the entire second edition is completed, since there's already so much valuable content!

@donald-pinckney I added a short explanation of the bootimage tool in #393 .

@LPGhatguy Most of it should work similarly. There are a few differences though, that make it a bit more difficult to follow. For example, we use our own bootinfo instead of the one from multiboot, so the corresponding part of the “Allocating Frames” post would work differently. Maybe we could explain the differences somewhere, so that users can continue with the content from the first edition…

PR for the new “Printing to Screen” post is up: #394

LLD is shipped with today's nightly 🎉! See #395 on how to use it. There is still an issue on macOS: rust-lang/rust#48772. Maybe someone has ideas how to fix it.

My current plan: Setup & printing first, then exceptions and hardware interrupts (including keyboard and timer support, and maybe even multithreading), then memory management

Big 💯 on this; this is the path I'm taking for intermezzOS as well. To me, it's not so much about a jump in difficulty, it's about getting to something interesting as fast as possible.

I'm slightly confused why the A Freestanding Rust Binary blog post talks about freestanding binaries on Windows. That specific bit of information seems a bit useless for writing a kernel, since I doubt PE/COFF binaries would work for that purpose. If you're just going to later use a target spec to create freestanding ELF binaries for kernel purposes, then defining Windows specific entry points seems like it wouldn't be needed.

Hm, so it's not finding lld on Windows:

> bootimage --target intermezzos
   Compiling core v0.0.0 (file:///C:/Users/steve/.rustup/toolchains/nightly-x86_64-pc-windows-msvc/lib/rustlib/src/rust/src/libcore)
    Finished release [optimized] target(s) in 32.81 secs
   Compiling intermezzos v0.1.0 (file:///C:/Users/steve/src/intermezzos/kernel-ng)
error: linker `ld.lld` not found
  |
  = note: The system cannot find the file specified. (os error 2)

> rustc --version
rustc 1.26.0-nightly (c9334404f 2018-03-05)

That is the right nightly.... hm.

So, the above is because of a bug in the text vs the .json in the repo; I'll send in a PR. Now failing with

error: linker `lld` not found

The plot thickens:

> ls C:\Users\steve\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\x86_64-pc-windows-msvc\bin
Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----         3/5/2018   7:29 PM       40011776 lld.exe

so it is there, but I guess this isn't in PATH?

@steveklabnik Thanks for the PR!

so it is there, but I guess this isn't in PATH?

Hmm, could be. Strange that it works on appveyor, but throws the same error on travis for macOS…

@retep998 Thanks for the feedback!

If you're just going to later use a target spec to create freestanding ELF binaries for kernel purposes, then defining Windows specific entry points seems like it wouldn't be needed.

It isn't needed. I just wanted Windows and macOS users to have something that successfully compiles at the end of the post. But I'm open to other variants, if you have any ideas.

Hmm, could be. Strange that it works on appveyor…

Yeah, this is very confusing.

Maybe you need to restart your shell or something?

No effect.

@phil-opp appveyor is using pc-windows-gnu which has MinGW binaries in the same directory as lld.exe. @steveklabnik is using pc-windows-msvc which only has lld.exe so it's possible that there is special handling only for the pc-windows-gnu target to add that directory to PATH but not for pc-windows-msvc. Restarting the shell should be irrelevant as that directory would never be part of your normal PATH but rather dynamically added when rustc invokes the linker.

@retep998 Seems like I posted the wrong link above. Appveyor is set up to test on msvc as well and it's green too: https://ci.appveyor.com/project/phil-opp/blog-os/build/1.0.134

In that case I have absolutely no idea.

so i installed the mingw version just to ensure that that worked, and

> rustup run nightly-x86_64-pc-windows-gnu bootimage --target intermezzos
   Compiling core v0.0.0 (file:///C:/Users/steve/.rustup/toolchains/nightly-x86_64-pc-windows-gnu/lib/rustlib/src/rust/src/libcore)
    Finished release [optimized] target(s) in 37.10 secs
   Compiling intermezzos v0.1.0 (file:///C:/Users/steve/src/intermezzos/kernel-ng)
error: linker `lld` not found
  |
  = note: The system cannot find the file specified. (os error 2)

So, now i have no idea.

- set PATH=C:\msys64\mingw%MSYS_BITS%\bin;C:\msys64\usr\bin;%PATH% in .appveyor.yml is making me suspicious that appveyor is only succeeding because there is an LLD somewhere in msys2 that is being picked up by rust. Is there anything you actually need from those directories?

It's needed for the gnu targets because we transitively depend on libz-sys, which tries to use tools such as make in its build script on gnu targets. Here is an earlier build that only adds .cargo/bin to the PATH: https://ci.appveyor.com/project/phil-opp/blog-os/build/1.0.131 (the corresponding .appveyor.yml). It still succeeds for the msvc targets.

(I tried to only modify the PATH for the gnu builds through conditional compilation, but I couldn't get it to work.)

Ok, I uninstalled all versions of lld on my local linux system and tried to build it with the included lld. Now I get the same error. So at least we get the same error on all OSs now. However, I still have no idea why the CI succeeds and how to fix it.

Add - ps: Get-Command lld to the .appveyor.yml. If it succeeds then that explains everything. If it fails then I'll be even more confused.

alex says this is supposed to work, or at least, it is intended to. he's trying to repro locally now.

The output:

Get-Command lld

CommandType     Name                                               Version    Source
-----------     ----                                               -------    ------
Application     lld.exe                                            0.0.0.0    C:\Program Files\LLVM\bin\lld.exe

And on travis:

$ lld --version
lld is a generic driver.
Invoke ld.lld (Unix), ld (macOS) or lld-link (Windows) instead.

Oh well, seems like appveyor and travis have it in their default path now. I had an earlier build where it couldn't find lld, but it seems like I just used the wrong command (ld.lld instead of lld -flavor gnu). Anyway, thanks for the idea @retep998!

@steveklabnik

he's trying to repro locally now.

Awesome! Let me know if I can help somewhere!

I have good news and bad news. Good news: this is intended to work!

Bad news:

15:09 <@acrichto> oh
15:09 <@acrichto> it's the sysroot error
15:09 <@acrichto> b/c xargo is changing the sysroot
15:09 <@acrichto> so rustc can't find the original lld
15:10 <@steveklabnik> ..... that makes perfect sense
15:10 <@steveklabnik> dangit
15:10 <@acrichto> this isn't too bad though
15:10 <@acrichto> you can just symlink directories
15:10 <@acrichto> so xargo makes a sysroot w/ your target
15:10 <@acrichto> and it can also symlink in the target for the host arch
15:11 <@steveklabnik> is this one of those things that gets fixed by unforking xargo, since it's now in a
                      semi-abandoned state?
15:11 <@acrichto> well in theory everything gets fixed because anything is possible :P
15:12 <@steveklabnik> hehe
15:13 <@steveklabnik> i guess i mean, 1. i guess this is expected, 2. is there any current plan for a fix, since it
                      sounds like this is a known thing
15:13 <@steveklabnik> i bet i can get someone to work on it if i can know what it is and what's likely to be accepted
                      :)
15:14 <@acrichto> I'm not sure we can fix in rustc :(
15:14 <@acrichto> but maybe we can try?
15:14 <@acrichto> have an array of sysroots to try?
15:14 <@steveklabnik> hrm
15:14 <@acrichto> certainly should be possible
15:15 <@steveklabnik> given that japaric is open to merging xargo PRs, not just working on them
15:15 <@steveklabnik> i wonder if doing that is easier/cleaner

So yeah. Not sure what strategy we should pursue here.

Thanks! Makes perfect sense.

The only thing I don't quite understand is why lld is part of the sysroot in the first place. It's target independent, so why not put it along rustc or cargo (e.g. in a subfolder that isn't included in the PATH)?

I'm going to revert #395 for now until we have a solution.

The only thing I don't quite understand is why lld is part of the sysroot in the first place. It's target independent, so why not put it along rustc or cargo (e.g. in a subfolder that isn't included in the PATH)?

alex says:

15:46 <@acrichto> steveklabnik: we don't want to clash with system lld
15:46 <@acrichto> steveklabnik: and we also don't necessarily want to stabilize the ability to get lld yet
15:47 <@acrichto> it's still sort of an internal implementation detail
15:47 <@acrichto> which we'd ideally reserve the right to change

The issue was fixed in japaric/xargo#200, so I re-applied the LLD change in #400. I also wrote a news post and mentioned the second edition on the front page in #401.

Hey all! About the emulator: I had an idea. Could we use v86 to run something like Tiny Core Linux, and then use that to compile the source from the relevant part of the post? Inside the VM, we could run qemu-system-x86_64 on the built code and show the build log and QEMU output to the user. Is this possible? I'm starting to work on something like this, and I'll report my findings here as I go...

That sounds like it would be very slow, but it could work.

Sounds interesting!

Just tested the Docker image. It appears to work on Windows 10 Pro, but I had to remove the entrypoint from the Dockerfile because I was getting an error message, apparently caused by the difference between DOS and Unix line breaks. In the end it booted blog_os in the container successfully(!)

cd docker

docker build -t blog_os .

docker run -it blog_os bash

cargo install cargo-xbuild bootimage

...wait a while...

Summary Successfully installed cargo-xbuild, bootimage!

now clone the blog_os

git clone https://github.com/phil-opp/blog_os.git

bootimage build
apt update
apt install qemu

qemu-system-x86_64 -curses -drive format=raw,file=bootimage-blog_os.bin

Success:

helloworld

@montao Thanks for testing! I marked it as done in the issue description.

With the release of the Allocator Designs post, we can finally tick the last missing item from this issue:

“Kernel Heap”: include more allocator types

At this point, all the content from the first edition that I wanted to keep is now also available in some form in the second edition. So I think we can finally close this issue :).

Re "javascript emulator":

unicorn.js seems to support x86_64. I believe it doesn't support any devices out of the box though. Unicorn does allow you to modify all internal state, so it should be possible to emulate devices.

Interesting! I don't have the time to look into this at the moment, but I'll keep it in mind.