phil-opp / blog_os

Writing an OS in Rust

Home Page:http://os.phil-opp.com

Double Fault in Interrupts post in latest nightly

phil-opp opened this issue · comments

There is a weird double fault bug when running the post-07 branch (or newer) on the latest Rust nightly:

image

I don't know the cause of this issue yet, but it seems related to the recent update to LLVM 18. I recommend using an older nightly until we have figured this out. You can do that by running these commands:

> rustup override set nightly-2024-02-01
> rustup component add rust-src llvm-tools-preview --toolchain nightly-2024-02-01

Then everything should work again.

(To remove the override later, run rustup override unset.)

Some observations:

  • The weird interrupt stack frame in the output above indicates that the x86-interrupt calling convention is currently broken on the latest LLVM version. The values are all shifted by one field. For example, 0x8 is the code segment, not the instruction pointer. Likewise, 518 is the CPU flags value, 0x10000201f48 is the stack pointer, and the stack segment is 0.
  • Cause of the double fault:
    • Using QEMU's -d int flag, I saw that the double fault occurs directly, without any other exception preceding it.
    • By disassembling the kernel binary and grepping for the VirtAddr, I discovered that the double fault occurs on the hlt instruction.
    • This indicates that the double fault is not a CPU exception at all, but a misinterpreted interrupt from the PIC.
      • If I keep interrupts disabled, the double fault disappears.
      • If I remove the PIC remapping, the same double fault occurs. Now the fault is expected because the PIC is mapped to the same IDT entries as the CPU exceptions by default.
    • So the issue seems to be that the PIC remapping doesn't work as expected.
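The "shifted by one field" observation above can be illustrated with a small standalone sketch. It models the interrupt stack frame as five 64-bit slots in the order the CPU lays them out in memory and reads them once correctly and once starting one slot too late. The instruction pointer value is hypothetical (the issue does not state the real one); the other four values are the ones quoted above. This only illustrates the symptom and makes no claim about what LLVM actually does internally.

```rust
/// Field names of the x86_64 interrupt stack frame, lowest address first.
const FIELDS: [&str; 5] = [
    "instruction_pointer",
    "code_segment",
    "cpu_flags",
    "stack_pointer",
    "stack_segment",
];

/// Hypothetical instruction pointer; the real value is not given in the issue.
const HYPOTHETICAL_RIP: u64 = 0x20_1234;

/// Frame contents as the CPU would have pushed them. All values except the
/// instruction pointer are taken from the observations in the issue.
const FRAME_MEMORY: [u64; 5] = [HYPOTHETICAL_RIP, 0x8, 518, 0x10000201f48, 0x0];

/// Pairs each field name with the slot found `shift` positions later.
/// Slots that would lie outside the frame are reported as `None`.
fn read_frame(memory: &[u64], shift: usize) -> Vec<(&'static str, Option<u64>)> {
    FIELDS
        .iter()
        .enumerate()
        .map(|(i, name)| (*name, memory.get(i + shift).copied()))
        .collect()
}

fn print_frame(title: &str, frame: &[(&'static str, Option<u64>)]) {
    println!("{title}");
    for (name, value) in frame {
        match value {
            Some(v) => println!("  {name}: {v:#x}"),
            None => println!("  {name}: <outside the frame>"),
        }
    }
}

fn main() {
    print_frame("correct read:", &read_frame(&FRAME_MEMORY, 0));
    // With a shift of one, the code segment (0x8) shows up as the
    // instruction pointer, matching the symptom described above.
    print_frame("shifted read:", &read_frame(&FRAME_MEMORY, 1));
}
```

In the shifted read, every field shows the value that belongs to the next field, which is the pattern visible in the broken output.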

It looks like the issue is that the .data section is not correctly loaded. It is all zero after loading for some reason.

As a result, the PIC initialization fails, which causes the double fault once interrupts are enabled. Re-creating the PICS static at runtime, right before initializing it, seems to fix the issue:

    let mut pics = interrupts::PICS.lock();
    *pics =
        unsafe { pic8259::ChainedPics::new(interrupts::PIC_1_OFFSET, interrupts::PIC_2_OFFSET) };
    unsafe { pics.initialize() };

So the issue seems to be the loading of the static, which is part of the .data section.
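To make the link between a failed PIC initialization and the double fault concrete, here is a minimal standalone sketch of the vector arithmetic. It assumes the legacy default offset of 8 for the primary PIC, a remapped offset of 32 (an assumed value for PIC_1_OFFSET, which this thread does not state), and that the misinterpreted interrupt is the timer on IRQ line 0. The helper `vector_for_irq` is hypothetical and not part of the pic8259 crate.

```rust
/// Vector number of the double fault exception on x86_64.
const DOUBLE_FAULT_VECTOR: u8 = 8;

/// Offset at which the primary PIC delivers IRQs when it was never remapped
/// (legacy default).
const DEFAULT_PIC_1_OFFSET: u8 = 8;

/// Assumed offset after remapping: the first vector after the 32 slots
/// reserved for CPU exceptions. The thread does not state this value.
const REMAPPED_PIC_1_OFFSET: u8 = 32;

/// IRQ line of the hardware timer (assumed to be the interrupt in question).
const TIMER_IRQ: u8 = 0;

/// Hypothetical helper: the IDT vector at which a PIC with the given offset
/// delivers the given IRQ line.
fn vector_for_irq(pic_offset: u8, irq: u8) -> u8 {
    pic_offset + irq
}

fn main() {
    let unmapped = vector_for_irq(DEFAULT_PIC_1_OFFSET, TIMER_IRQ);
    let remapped = vector_for_irq(REMAPPED_PIC_1_OFFSET, TIMER_IRQ);

    println!("timer vector without remapping: {unmapped}");
    println!("timer vector with remapping:    {remapped}");

    if unmapped == DOUBLE_FAULT_VECTOR {
        println!("without remapping, the timer interrupt lands on the double fault vector");
    }
}
```

Under these assumptions, a PIC that was never successfully remapped raises the timer interrupt on vector 8, so the double fault handler runs even though no CPU exception occurred.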

I found the issue and fixed it in rust-osdev/bootloader#424.

If you're affected by this, run cargo update -p bootloader.

I switched to the older nightly version but still get this problem.

$ rustc --version --verbose
rustc 1.77.0-nightly (11f32b73e 2024-01-31)
binary: rustc
commit-hash: 11f32b73e0dc9287e305b5b9980d24aecdc8c17f
commit-date: 2024-01-31
host: x86_64-unknown-linux-gnu
release: 1.77.0-nightly
LLVM version: 17.0.6

image

I'm using Ubuntu 22.04 under WSL2 on an x86 computer.

Sorry, it turns out that I had left x86_64::instructions::interrupts::int3(); in my _start function, and I think that caused the double fault. Everything works well once I remove this statement.